Abstract

The computation of eigenvalues of real symmetric tridiagonal matrices
frequently proceeds by a sequence of QR steps with shifts. We introduce
simple shift strategies, functions $\sigma$ satisfying natural conditions, taking each
$n \times n$ matrix $T$ to a real number $\sigma(T)$. The strategy specifies the shift to
be applied by the QR step at $T$. Rayleigh's and Wilkinson's are examples of
simple shift strategies. We show that if $\sigma$ is continuous then there exist initial
conditions for which deflation does not occur, i.e., subdiagonal entries do not
tend to zero. In case of deflation, we consider the rate of convergence to
zero of the $(n, n-1)$ entry: for simple shift strategies this is always at least
quadratic. If the function $\sigma$ is smooth in a suitable region and the spectrum
of $T$ does not include three consecutive eigenvalues in arithmetic progression
then convergence is cubic. This implies cubic convergence to deflation of
Wilkinson's shift for generic spectra. The study of the algorithm near deflation
uses tubular coordinates, under which QR steps with shifts are given by a
simple formula.

Keywords: Isospectral manifold, Deflation, Wilkinson's shift, Shifted QR algorithm.
1 Introduction

Let $\mathcal{T}$ be the vector space of real symmetric tridiagonal matrices. Among the
standard algorithms to compute eigenvalues of matrices in $\mathcal{T}$ are QR steps with
different shift strategies: Rayleigh's and Wilkinson's are familiar examples (excellent
references are [15], [4], [13]). In this paper, we consider a more general context:
we define simple shift strategies, which include the examples above and more, and
discuss subtle aspects of their asymptotic behavior.

More precisely, given a matrix $T \in \mathcal{T}$ and $s \in \mathbb{R}$, write $T - sI = QR$, if
possible, for an orthogonal matrix $Q$ and an upper triangular matrix $R$ with positive
diagonal entries. A shifted QR step is $\Phi(T, s) = Q^* T Q$. As is well known, shifted
QR steps preserve spectrum and shape. For a real diagonal matrix with simple spectrum
$\Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)$, let $\mathcal{T}_\Lambda \subset \mathcal{T}$ be the set of matrices similar to $\Lambda$. A simple shift
strategy is a function $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$ satisfying the following two properties.

(I) For all $T \in \mathcal{T}_\Lambda$, $\sigma(E_n T E_n) = \sigma(T)$, where $E_n = \operatorname{diag}(1, 1, \dots, 1, -1)$.

(II) There exists $C_\sigma > 0$ such that for all $T \in \mathcal{T}_\Lambda$ there is an eigenvalue $\lambda_i$ with
$|\sigma(T) - \lambda_i| \le C_\sigma |b(T)|$, where $b(T) = (T)_{n,n-1}$.

This definition excludes algorithms which employ multi-shifts and extrapolation
techniques, but accommodates the usual Rayleigh and Wilkinson strategies.



For technical reasons, we prefer the signed variant $\Phi_\star(T, s) = Q_\star^* T Q_\star$, where
now $T - sI = Q_\star R_\star$, the orthogonal matrix $Q_\star$ has positive determinant and only
the first $n-1$ diagonal entries of the upper triangular matrix $R_\star$ are required to be
positive. It is easy to see that either $\Phi(T, s) = \Phi_\star(T, s)$ or $\Phi(T, s) = E_n \Phi_\star(T, s) E_n$
(notice that $E_n T E_n$ is obtained from $T \in \mathcal{T}$ by changing the signs of the entries
$(n, n-1)$ and $(n-1, n)$). As we shall see, the signed step is smoothly defined
on a larger domain, and convergence issues for both kinds of step iterations are
essentially equivalent.

Simple shift strategies prescribe shifts: set $F_s(T) = \Phi_\star(T, s)$ and $F_\sigma(T) =
F_{\sigma(T)}(T)$. Algorithms iterate $F_\sigma$ aiming at deflation, i.e., obtaining a matrix $T \in \mathcal{T}_\Lambda$
with small $|b(T)|$. The deflation set (resp. neighborhood) is the set $\mathcal{D}_{\Lambda,0} \subset \mathcal{T}_\Lambda$ (resp.
$\mathcal{D}_{\Lambda,\epsilon} \subset \mathcal{T}_\Lambda$) consisting of matrices $T$ for which $b(T) = 0$ (resp. $|b(T)| \le \epsilon$). A simple
shift strategy $\sigma$ is deflationary if for any $T \in \mathcal{T}_\Lambda$ and any $\epsilon > 0$ there exists $k$ for
which $F_\sigma^k(T) \in \mathcal{D}_{\Lambda,\epsilon}$. As is well known, Rayleigh's strategy is not deflationary and
Wilkinson's is: our first result provides a context for these facts.

Theorem 1 A continuous shift strategy $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$ is not deflationary.

We are thus led to consider the singular support $\mathcal{S}_\sigma \subset \mathcal{T}_\Lambda$ of a shift strategy
$\sigma$, i.e., the minimal closed subset of $\mathcal{T}_\Lambda$ on whose complement $\sigma$ is smooth. For
Rayleigh's strategy $\mathcal{S}_\sigma$ is empty; for Wilkinson's strategy it consists of the matrices $T \in \mathcal{T}_\Lambda$
with $(T)_{n-1,n-1} = (T)_{n,n}$.

Numerical evidence brings up the question of whether the rate of convergence
to zero of the sequence $b(F_\sigma^k(T))$ is cubic, in the sense that there is a constant $C$
such that $|b(F_\sigma^{k+1}(T))| \le C |b(F_\sigma^k(T))|^3$ for large $k$. It turns out that, for any shift
strategy $\sigma$, in an appropriate neighborhood of the deflation set $\mathcal{D}_{\Lambda,0}$, each iteration
of $F_\sigma$ squeezes the $(n, n-1)$ entry quadratically. Away from the singular support
$\mathcal{S}_\sigma$, squeezing is cubic.

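Theorem 2 For $\epsilon > 0$ small enough, the deflation neighborhood $\mathcal{D}_{\Lambda,\epsilon}$ is invariant
under $F_\sigma$. There exists $C > 0$ such that, for all $T \in \mathcal{D}_{\Lambda,\epsilon}$, $|b(F_\sigma(T))| \le C |b(T)|^2$.
Also, given a compact set $\mathcal{K} \subset \mathcal{D}_{\Lambda,\epsilon}$ disjoint from $\mathcal{S}_\sigma \cap \mathcal{D}_{\Lambda,0}$, there exists $C_{\mathcal{K}} > 0$
such that, for all $T \in \mathcal{K}$, $|b(F_\sigma(T))| \le C_{\mathcal{K}} |b(T)|^3$.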

Cubic convergence does not hold in general for Wilkinson's strategy. In [8],
for $\Lambda = \operatorname{diag}(-1, 0, 1)$, we construct a Cantor-like set $\mathcal{X} \subset \mathcal{T}_\Lambda$ of unreduced initial
conditions for which the rate of convergence is strictly quadratic. Sequences starting
at $\mathcal{X}$ converge to a reduced matrix which is not diagonal. For Rayleigh's shift, on the
other hand, convergence is always cubic within invariant deflation neighborhoods.

A matrix $T \in \mathcal{T}$ with simple spectrum is a.p. free if it does not have three
eigenvalues in arithmetic progression and a.p. otherwise. For a.p. free spectra, the
situation is very nice: cubic convergence is essentially uniform on $\mathcal{T}_\Lambda$.

Theorem 3 Let $\Lambda$ be an a.p. free matrix and $\sigma$ a shift strategy for which diagonal
matrices do not belong to $\mathcal{S}_\sigma$. Then there exist $\epsilon > 0$, $C > 0$ and $K > 0$ such that
the deflation neighborhood $\mathcal{D}_{\Lambda,\epsilon}$ is invariant under $F_\sigma$. Also, for any $T \in \mathcal{D}_{\Lambda,\epsilon}$, the
sequence $(F_\sigma^k(T))$ converges to a diagonal matrix and the set of positive integers $k$
for which $|b(F_\sigma^{k+1}(T))| > C |b(F_\sigma^k(T))|^3$ has at most $K$ elements.

Still, the finite set of points in which the cubic estimate does not hold may occur
arbitrarily late along the sequence $(F_\sigma^k(T))$.

An a.p. matrix is strong a.p. if it contains three consecutive eigenvalues in arithmetic
progression and weak a.p. otherwise. Under very mild additional hypotheses,
$b(T)$ converges to zero at a cubic rate also for weak a.p. matrices. Let $\mathcal{C}_{\Lambda,0} \subset \mathcal{T}_\Lambda$ be
the set of matrices $T$ for which $(T)_{n,n-1} = (T)_{n-1,n-2} = 0$.
Theorem 4 Let $\Lambda$ be a weak a.p. matrix and $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$ a shift strategy for which
$\mathcal{C}_{\Lambda,0}$ and $\mathcal{S}_\sigma$ are disjoint. Then there exists $\epsilon > 0$ such that the deflation neighborhood
$\mathcal{D}_{\Lambda,\epsilon}$ is invariant under $F_\sigma$ and, for all unreduced $T \in \mathcal{D}_{\Lambda,\epsilon}$, the sequence
$(b(F_\sigma^k(T)))$ converges to zero at a rate which is at least cubic. More precisely, for
each unreduced $T \in \mathcal{D}_{\Lambda,\epsilon}$ there exist $C_T, K_T > 0$ such that, for all $k > K_T$, we have
$|b(F_\sigma^{k+1}(T))| \le C_T |b(F_\sigma^k(T))|^3$.

In particular, the convergence of Wilkinson's strategy is cubic for weak a.p.
matrices. However, uniformity in the sense of Theorem 3 is not guaranteed and the
constants $C_T$ and $K_T$ depend on $T$. As in the case of the spectrum $\{-1, 0, 1\}$, we
conjecture that if $\Lambda$ is strong a.p. then there exists a set $\mathcal{X} \subset \mathcal{T}_\Lambda$ of Hausdorff codimension
1 of initial conditions $T$ for which the rate of convergence is strictly quadratic.
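
The three classes of spectra used in Theorems 3 and 4 can be told apart mechanically. The following is a minimal sketch, not part of the original argument: the function name `classify_spectrum` and the tolerance `tol` are illustrative assumptions, and the spectrum is assumed simple and given as a list of numbers.

```python
from itertools import combinations


def classify_spectrum(eigs, tol=1e-12):
    """Classify a simple spectrum as 'a.p. free', 'weak a.p.' or 'strong a.p.'.

    Hypothetical helper for illustration; `tol` is an assumed tolerance for
    deciding when three numbers are in arithmetic progression.
    """
    lam = sorted(eigs)
    # Strong a.p.: three consecutive eigenvalues in arithmetic progression.
    for i in range(len(lam) - 2):
        if abs(lam[i] - 2 * lam[i + 1] + lam[i + 2]) <= tol:
            return "strong a.p."
    # Weak a.p.: some three eigenvalues in progression, but never consecutive.
    for x, y, z in combinations(lam, 3):
        if abs(x - 2 * y + z) <= tol:
            return "weak a.p."
    return "a.p. free"
```

For instance, the spectrum $\{-1, 0, 1\}$ discussed above is strong a.p., while $\{0, 1, 3, 6\}$ is weak a.p., since the only progression $0, 3, 6$ skips the eigenvalue $1$.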

The proofs of the above results depend on a few basic ideas. Signed shifted steps
$F_s(T)$ are shown to be well defined for unreduced matrices and in an open neighborhood
of the deflation set $\mathcal{D}_{\Lambda,0}$. The compact set (manifold) $\mathcal{D}_{\Lambda,0}$ splits into
connected components $\mathcal{D}^i_{\Lambda,0}$ consisting of matrices $T \in \mathcal{T}_\Lambda$ with $(T)_{n,n} = \lambda_i$. For
small $\epsilon$, the deflation neighborhood splits into components $\mathcal{D}^i_{\Lambda,\epsilon}$ which are thickenings
of $\mathcal{D}^i_{\Lambda,0}$.

Tubular coordinates provide a good understanding both of tubular neighborhoods
$\mathcal{D}^i_{\Lambda,\epsilon}$ and of shifted QR steps within these sets. A previous unpublished
version of this paper ([5]) uses instead bidiagonal coordinates, defined in [7], to
prove some of the results presented here for Wilkinson's shift; these coordinates
are also used in [7] to prove the cubic convergence of Rayleigh's shift. Bidiagonal
coordinates consist of very explicit charts on the manifold $\mathcal{T}_\Lambda$. In both coordinate
systems, shifted QR steps are very simple.

Sufficiently thin deflation neighborhoods are invariant under steps $F_\sigma$. Theorem
1 then becomes a connectivity argument: in a nutshell, there must be separatrices
in order to permit deflation to different components $\mathcal{D}^i_{\Lambda,\epsilon}$.

Steps $F_\sigma$ are smooth whenever the shift strategy is, i.e., for $T \in \mathcal{D}^i_{\Lambda,\epsilon} \setminus \mathcal{S}_\sigma$.
At matrices $T_0 \in \mathcal{D}^i_{\Lambda,0}$ on which $F_\sigma$ is smooth, the map $T \mapsto b(F_\sigma(T))$ has zero
gradient. The symmetry of the shift strategy (condition (I)) yields a cubic Taylor
expansion and therefore an estimate $|b(F_\sigma(T))| \le C |b(T)|^3$, settling Theorem 2.

Height functions $H_i : \mathcal{D}^i_{\Lambda,\epsilon} \to \mathbb{R}$ are used for further study of the sequence
$(F_\sigma^k(T))$. More precisely, for $s$ near $\lambda_i$, $H_i(F_s(T)) > H_i(T)$ provided $T \in \mathcal{D}^i_{\Lambda,\epsilon}$
is not diagonal. A compactness argument then bounds the number of iterations for
which $F_\sigma^k(T)$ stays close to the singular support $\mathcal{S}_\sigma$: this is essentially Theorem 3.

For a.p. spectra, the situation is subtler. As numerical analysts know, shift
strategies usually define sequences of matrices which, asymptotically, not only isolate
an eigenvalue at the $(n, n)$ position but also isolate, at a slower rate, a second
eigenvalue at the $(n-1, n-1)$ position. This does not happen for the example in
[8], where $(F_\sigma^k(T))_{n,n}$ tends to the center of a three-term arithmetic progression of
eigenvalues and $(F_\sigma^k(T))_{n-1,n-2}$ stays bounded away from 0. On the other hand,
Theorem 4 tells us that the weak a.p. hypothesis together with an appropriate
smoothness condition guarantees cubic convergence.

In Section 2 we list the basic properties of the signed shifted QR step. Simple
shift strategies are introduced in Section 3, and the standard examples are shown
to satisfy the definition. We define the deflation set $\mathcal{D}_{\Lambda,0}$ and neighborhood $\mathcal{D}_{\Lambda,\epsilon}$
in Section 4 and then set up tubular coordinates. The local theory of steps $F_s$ near
$\mathcal{D}_{\Lambda,0}$ and the proof of Theorem 2 are presented in Section 5. Section 6 is dedicated
to Theorem 1. In Section 7 we construct the height functions $H_i$ and then prove
Theorem 3. The convergence properties of a.p. matrices in Theorem 4 are proved
in Section 8. We finally present in Section 9 two counterexamples to natural but
incorrect strengthenings of Theorems 3 and 4.
2 QR iteration with shift and a variation

For a matrix $M$, the QR factorization is $M = QR$ for an orthogonal matrix $Q$ and
an upper triangular matrix $R$ with positive diagonal. As usual, let $SO(n)$ denote
the set of orthogonal matrices with determinant equal to 1. The $Q_\star R_\star$ factorization,
instead, is $M = Q_\star R_\star$ for $Q_\star \in SO(n)$ and $R_\star$ an upper triangular matrix with
$(R_\star)_{i,i} > 0$, $i = 1, \dots, n-1$. A real $n \times n$ matrix $M$ is almost invertible if its first
$n-1$ columns are linearly independent: notice that almost invertible matrices are
dense within $n \times n$ matrices and form an open set. The diagonal matrix $E_n$ is such
that $(E_n)_{i,i}$ is $1$ for $i < n$ and $-1$ for $i = n$.

Proposition 2.1 An almost invertible real matrix $M$ admits a unique $Q_\star R_\star$ factorization,
with $Q_\star$ and $R_\star$ depending smoothly on $M$. If $M$ is invertible, it admits
unique (smooth) factorizations $M = QR = Q_\star R_\star$. If $\det M > 0$, the factorizations
are equal, i.e., $Q = Q_\star$ and $R = R_\star$. If $\det M < 0$, $Q = Q_\star E_n$ and $R = E_n R_\star$. If
$\det M = 0$, $(R_\star)_{n,n} = 0$.

Proof: Let $M$ be almost invertible. Applying Gram-Schmidt with positive normalizations
on its first $n-1$ columns we obtain the first $n-1$ columns of both $Q$ and
$R$, as well as those of $Q_\star$ and $R_\star$. The last column $v = Q_\star e_n$ of $Q_\star$ is already well
defined, by orthonormality and the fact that $\det Q_\star = 1$. Now, set $R_\star = (Q_\star)^* M$.
The positivity of $(R)_{n,n}$ specifies whether the last column of $Q$ is $v$ or $-v$. Smoothness
is clear by construction.

If $M$ is invertible, $\det M = \det Q_\star \det R_\star$ implies that the last diagonal entry of
$R_\star$ has the same sign as $\det M$: the relations between the factorizations then follow.
If $M$ is not invertible, the relation among determinants implies $(R_\star)_{n,n} = 0$. ■
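
The construction in the proof can be sketched for $n = 3$, where the last column of $Q_\star$ is simply the cross product of the first two. This is an illustrative sketch under that restriction, not a general implementation: the name `qr_star_3x3` and the list-of-rows representation are assumptions made here.

```python
import math


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def qr_star_3x3(M):
    """Signed factorization M = Q R of a 3x3 matrix (list of rows).

    Assumes the first two columns of M are linearly independent
    (M is "almost invertible"). Returns (Q, R) as lists of rows with
    det Q = 1 and R[0][0], R[1][1] > 0; R[2][2] may have any sign.
    """
    cols = [[M[i][j] for i in range(3)] for j in range(3)]
    m1, m2 = cols[0], cols[1]
    # Gram-Schmidt with positive normalization on the first two columns.
    n1 = math.sqrt(_dot(m1, m1))
    q1 = [x / n1 for x in m1]
    p = _dot(q1, m2)
    w = [y - p * x for x, y in zip(q1, m2)]
    n2 = math.sqrt(_dot(w, w))
    q2 = [x / n2 for x in w]
    # The cross product completes a positively oriented orthonormal basis.
    q3 = [q1[1] * q2[2] - q1[2] * q2[1],
          q1[2] * q2[0] - q1[0] * q2[2],
          q1[0] * q2[1] - q1[1] * q2[0]]
    qs = [q1, q2, q3]
    Q = [[qs[j][i] for j in range(3)] for i in range(3)]
    # R = Q^T M; entries below the diagonal vanish up to rounding.
    R = [[_dot(qs[i], cols[j]) for j in range(3)] for i in range(3)]
    return Q, R
```

In this sketch the sign of `R[2][2]` agrees with the sign of $\det M$, and it vanishes when $M$ is singular, as the proposition states.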

Let $\mathcal{T}$ denote the real vector space of $n \times n$ real, symmetric, tridiagonal matrices
endowed with the norm $\|T\|^2 = \operatorname{tr}(T^2)$. For $T \in \mathcal{T}$, the subdiagonal entries of $T$ are
$(T)_{i+1,i}$ for $i = 1, \dots, n-1$. The lowest subdiagonal entry of $T$ is $b(T) = (T)_{n,n-1}$.
If all subdiagonal entries of $T$ are nonzero, $T$ is an unreduced matrix; otherwise, $T$ is
reduced. Notice that an unreduced tridiagonal matrix is almost invertible: indeed,
the block formed by rows $2, \dots, n$ and columns $1, \dots, n-1$ is an upper triangular
matrix with nonzero diagonal entries, and therefore invertible.

We consider the shifted QR step and its signed counterpart,

$$\Phi(T, s) = Q^* T Q, \qquad \Phi_\star(T, s) = Q_\star^* T Q_\star,$$

where $T - sI = QR$ and $T - sI = Q_\star R_\star$. The pair $(T, s) \in \mathcal{T} \times \mathbb{R}$ belongs to the
natural (open, dense) domains $\operatorname{Dom}(\Phi)$ and $\operatorname{Dom}(\Phi_\star)$ if $T - sI$ is invertible and
almost invertible, respectively: clearly, the functions $\Phi$ and $\Phi_\star$ are smooth in their
domains.

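Lemma 2.2 For $(T, s) \in \operatorname{Dom}(\Phi)$ (resp. $\operatorname{Dom}(\Phi_\star)$), we have $\Phi(T, s) \in \mathcal{T}$ (resp.
$\Phi_\star(T, s) \in \mathcal{T}$). The spectra of $T$, $\Phi(T, s)$ and $\Phi_\star(T, s)$ are equal. In the appropriate
domains, for $T - sI = QR = Q_\star R_\star$ and $i = 1, 2, \dots, n-1$,

$$(\Phi(T, s))_{i+1,i} = \frac{(R)_{i+1,i+1}}{(R)_{i,i}} (T)_{i+1,i}, \qquad (\Phi_\star(T, s))_{i+1,i} = \frac{(R_\star)_{i+1,i+1}}{(R_\star)_{i,i}} (T)_{i+1,i}.$$

Thus, the top $n-2$ subdiagonal entries of $T$, $\Phi(T, s)$ and $\Phi_\star(T, s)$ have the same
sign; also, $\operatorname{sign}(T)_{n,n-1} = \operatorname{sign}(\Phi(T, s))_{n,n-1}$.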
Proof: We prove the statements for $\Phi_\star$; the others are then easy.

For a pair $(T, s) \in \operatorname{Dom}(\Phi) \subset \operatorname{Dom}(\Phi_\star)$, there are two expressions for $\Phi_\star(T, s)$:

$$\Phi_\star(T, s) = Q_\star^* T Q_\star = R_\star T R_\star^{-1}, \quad \text{where } T - sI = Q_\star R_\star.$$

From the first equality, $\Phi_\star(T, s)$ is symmetric and from the second, $\Phi_\star(T, s)$ is an
upper Hessenberg matrix so that $\Phi_\star(T, s) \in \mathcal{T}$ is similar to $T$. More generally, for
$(T, s) \in \operatorname{Dom}(\Phi_\star)$ we still have

$$\Phi_\star(T, s) = Q_\star^* T Q_\star, \qquad \Phi_\star(T, s) R_\star = R_\star T$$

and therefore $\Phi_\star(T, s) \in \mathcal{T}$ is similar to $T$. Compute the $(i+1, i)$ entry of the second
equation above to obtain $(\Phi_\star(T, s))_{i+1,i} (R_\star)_{i,i} = (R_\star)_{i+1,i+1} (T)_{i+1,i}$, completing
the proof. ■
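
The subdiagonal formula of Lemma 2.2 can be checked numerically for the unsigned step $\Phi$. The sketch below is an illustration under stated assumptions: matrices are plain lists of rows, $T - sI$ is assumed invertible and reasonably well conditioned, and the helper names `qr_gram_schmidt` and `shifted_qr_step` are hypothetical.

```python
import math


def qr_gram_schmidt(M):
    """QR factorization with positive diagonal of R (modified Gram-Schmidt).

    M is a list of rows, assumed invertible. Returns (Q, R) as lists of rows.
    """
    n = len(M)
    cols = [[M[i][j] for i in range(n)] for j in range(n)]
    q = []  # orthonormal vectors, the columns of Q
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for i in range(len(q)):
            R[i][j] = sum(a * b for a, b in zip(q[i], v))
            v = [a - R[i][j] * b for a, b in zip(v, q[i])]
        R[j][j] = math.sqrt(sum(a * a for a in v))
        q.append([a / R[j][j] for a in v])
    Q = [[q[j][i] for j in range(n)] for i in range(n)]
    return Q, R


def shifted_qr_step(T, s):
    """Return (Phi(T, s), R), where T - s I = Q R and Phi(T, s) = Q^T T Q."""
    n = len(T)
    M = [[T[i][j] - (s if i == j else 0.0) for j in range(n)] for i in range(n)]
    Q, R = qr_gram_schmidt(M)
    TQ = [[sum(T[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    Phi = [[sum(Q[k][i] * TQ[k][j] for k in range(n)) for j in range(n)]
           for i in range(n)]
    return Phi, R
```

Given the output `Phi, R`, each entry `Phi[i+1][i]` should agree, up to rounding, with `R[i+1][i+1] / R[i][i] * T[i+1][i]`, and the trace of `Phi` should equal that of `T`.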

The following result describes the behavior of $\Phi_\star$ at points not in $\operatorname{Dom}(\Phi)$, which
will play an important role throughout the paper.

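Lemma 2.3 If $(T, s) \in \operatorname{Dom}(\Phi_\star) \setminus \operatorname{Dom}(\Phi)$ then

$$b(\Phi_\star(T, s)) = (\Phi_\star(T, s))_{n,n-1} = 0, \qquad (\Phi_\star(T, s))_{n,n} = s.$$

At a point $(T, s) \in \operatorname{Dom}(\Phi_\star)$ with $b(T) = 0$ and $s = (T)_{n,n}$ we have $\operatorname{grad}(b \circ \Phi_\star) = 0$.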

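Proof: Since $T - sI = Q_\star R_\star = R_\star^* Q_\star^*$ is not invertible then $(R_\star)_{n,n} = 0$ and
therefore $R_\star^* e_n = 0$. Thus $v = (Q_\star^*)^{-1} e_n = Q_\star e_n$ satisfies $(T - sI)v = 0$. We then
have $\Phi_\star(T, s) e_n = Q_\star^* T Q_\star e_n = Q_\star^* T v = Q_\star^*(sv) = s e_n$, proving the first claim. For
the second claim, since $T - sI$ is almost invertible, $(R_\star)_{i,i} > 0$ for $i < n$. From the
previous lemma,

$$(b \circ \Phi_\star)(T, s) = \frac{(R_\star)_{n,n}}{(R_\star)_{n-1,n-1}} \, b(T);$$

if $b(T) = 0$ and $s = (T)_{n,n}$ then $(R_\star)_{n,n} = 0$ and $b \circ \Phi_\star$ is a product of two smooth
functions, both zero, yielding $\operatorname{grad}(b \circ \Phi_\star) = 0$. ■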

Recall that symmetrically changing the signs of subdiagonal entries does not
change the spectrum of a matrix in $\mathcal{T}$. Let $\mathcal{E}$ denote the set of signed diagonal
matrices with $\pm 1$ along the diagonal entries; in particular, $E_n \in \mathcal{E}$. The operation
of changing subdiagonal signs, i.e., of conjugation by some $E \in \mathcal{E}$, behaves well with
respect to $\Phi$ and $\Phi_\star$.

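Lemma 2.4 Let $E \in \mathcal{E}$. The domains $\operatorname{Dom}(\Phi)$ and $\operatorname{Dom}(\Phi_\star)$ are invariant under
conjugation by $E$ and

$$\Phi(ETE, s) = E \, \Phi(T, s) \, E, \qquad \Phi_\star(ETE, s) = E \, \Phi_\star(T, s) \, E.$$

If $\det(T - sI) > 0$ then $\Phi(T, s) = \Phi_\star(T, s)$; if $\det(T - sI) < 0$, $\Phi(T, s) =
E_n \Phi_\star(T, s) E_n$; if $\det(T - sI) = 0$ and $(T, s) \in \operatorname{Dom}(\Phi_\star)$, then $b(\Phi_\star(T, s)) = 0$.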

Proof: For $(T, s) \in \operatorname{Dom}(\Phi)$, the matrices $T - sI$ and $E(T - sI)E$ are both invertible.
The QR factorization $T - sI = QR$ yields $ETE - E(sI)E = (EQE)(ERE)$,
preserving the positivity of the diagonal entries of the triangular part, so

$$\Phi(ETE, s) = (EQE)^* \, ETE \, (EQE) = E Q^* T Q E = E \, \Phi(T, s) \, E.$$

The argument is similar for $\Phi_\star$. The claims for $T - sI$ invertible follow from the
relation between $Q$ and $Q_\star$ in Proposition 2.1; the case $\det(T - sI) = 0$ is a repetition
of Lemma 2.3. ■
We are only interested in the case when the spectrum of $T$ is simple, since a
double eigenvalue implies reducibility. Let $\Lambda$ be a real diagonal matrix with simple
eigenvalues $\lambda_1 < \cdots < \lambda_n$. Define the isospectral manifold

$$\mathcal{T}_\Lambda = \{Q^* \Lambda Q, \ Q \in SO(n)\} \cap \mathcal{T},$$

the set of matrices in $\mathcal{T}$ similar to $\Lambda$. The set $\mathcal{T}_\Lambda \subset \mathcal{T}$ is a real smooth manifold
([14]; [7] describes an explicit atlas of $\mathcal{T}_\Lambda$). Since either version of shifted QR step
preserves spectrum, restriction defines smooth maps $\Phi : (\mathcal{T}_\Lambda \times \mathbb{R}) \cap \operatorname{Dom}(\Phi) \to \mathcal{T}_\Lambda$
and $\Phi_\star : (\mathcal{T}_\Lambda \times \mathbb{R}) \cap \operatorname{Dom}(\Phi_\star) \to \mathcal{T}_\Lambda$.

Still in $\mathcal{T}_\Lambda$, it is convenient to consider the step $F_s(T) = \Phi_\star(T, s)$. For $s$ not an
eigenvalue of $\Lambda$, the domain of $F_s$ is $\mathcal{T}_\Lambda$. The natural domain for $F_{\lambda_i}$ instead is the
deflation domain $\mathcal{D}^i_\Lambda$, the open dense subset of $\mathcal{T}_\Lambda$ of matrices $T$ for which $T - \lambda_i I$
is almost invertible. In other words, $T \in \mathcal{D}^i_\Lambda$ if and only if $\lambda_i$ is an eigenvalue of the
lowest irreducible block of $T$.

The definition of the step $F_s$ differs from the usual one in that we use $\Phi_\star$ instead
of $\Phi$. Given Lemma 2.4, considerations about deflation are unaffected and our choice
has the advantage of being smooth (and well defined) in $\mathcal{D}^i_\Lambda$.

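The ($i$-th) deflation set is

$$\mathcal{D}^i_{\Lambda,0} = \{T \in \mathcal{T}_\Lambda \mid b(T) = 0, \ (T)_{n,n} = \lambda_i\}.$$

Since the spectrum of $\Lambda$ is simple, $\mathcal{D}^i_{\Lambda,0} \subset \mathcal{D}^i_\Lambda$. Also, if $i \ne j$ then $\mathcal{D}^i_{\Lambda,0} \cap \mathcal{D}^j_\Lambda = \emptyset$.
We saw in Lemma 2.3 that when the shift is taken to be an eigenvalue, a single step
deflates a matrix, i.e., that the image of $F_{\lambda_i}$ is contained in $\mathcal{D}^i_{\Lambda,0}$: we shall see in
Proposition 4.1 that this image is in fact equal to $\mathcal{D}^i_{\Lambda,0}$.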

3 Simple shift strategies 

Quoting Parlett [13], there are shifts for all seasons. The point of using a shift
strategy is to accelerate deflation, ideally by choosing $s$ near an eigenvalue of $T$. A
simple shift strategy is a function $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$ such that:

(I) for all $T \in \mathcal{T}_\Lambda$, $\sigma(E_n T E_n) = \sigma(T)$;

(II) there exists $C_\sigma > 0$ such that for all $T \in \mathcal{T}_\Lambda$ there is an eigenvalue $\lambda_i$ with
$|\sigma(T) - \lambda_i| \le C_\sigma |b(T)|$.

In particular, if $T \in \mathcal{D}^i_{\Lambda,0}$ then $\sigma(T) = \lambda_i$. The step associated with a (simple)
shift strategy $\sigma$ is defined by $F_\sigma(T) = F_{\sigma(T)}(T)$. The natural domain for $F_\sigma$
is the set of matrices $T$ for which $T - \sigma(T) I$ is almost invertible. From Section
2, it includes all unreduced matrices and open neighborhoods of each deflation set
$\mathcal{D}^i_{\Lambda,0}$. We shall also see in Section 6 that it contains a dense open subset $\mathcal{U}_{\Lambda,\epsilon}$ of $\mathcal{T}_\Lambda$
invariant under $F_\sigma$. A more careful description of this domain will not be needed.

We leave to the reader the verification that Rayleigh's shift $\rho(T) = (T)_{n,n}$ is a
simple shift strategy. Denote the bottom $2 \times 2$ diagonal principal minor of a matrix
$T \in \mathcal{T}$ by $\hat{T}$: Wilkinson's shift $\omega(T)$ is the eigenvalue of $\hat{T}$ closer to $(T)_{n,n}$ (in case
of a draw, take the smallest eigenvalue).
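
Both shifts are easy to write down explicitly. The sketch below is illustrative only: the function names are assumptions, and the matrix is taken to be a list of rows. It uses the closed form for the eigenvalues of the bottom $2 \times 2$ block, $c + d \pm \sqrt{d^2 + b^2}$ with $c = (T)_{n,n}$, $b = (T)_{n,n-1}$ and $d = ((T)_{n-1,n-1} - c)/2$.

```python
import math


def rayleigh_shift(T):
    """Rayleigh's shift: the bottom diagonal entry (T)_{n,n}."""
    return T[-1][-1]


def wilkinson_shift(T):
    """Wilkinson's shift: the eigenvalue of the bottom 2x2 block closer to
    (T)_{n,n}; on a draw the smaller eigenvalue is taken, as in the text."""
    a, b, c = T[-2][-2], T[-1][-2], T[-1][-1]
    d = (a - c) / 2.0
    r = math.hypot(d, b)
    # The two eigenvalues are c + d - r and c + d + r; their distances to c
    # are r - d and r + d respectively, so the sign of d decides.
    return c + d - r if d >= 0 else c + d + r
```

Since only $b^2$ enters the formula, changing the sign of $b$ leaves the value unchanged, which is condition (I) for this strategy.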

Lemma 3.1 The function $\omega$ is a simple shift strategy with $C_\omega = 2\sqrt{2}$.

Proof: Condition (I) follows from the fact that changing signs of off-diagonal entries
of a $2 \times 2$ matrix does not change its spectrum. For (II), apply the Wielandt-Hoffman
theorem to the $2 \times 2$ trailing principal minors of $T$ and $S = T - b(T)B$ to deduce
that $|(T)_{n,n} - \omega(T)| \le \sqrt{2}\,|b(T)|$. Again from Wielandt-Hoffman, now on $S$ as in
Proposition 4.4, $|(T)_{n,n} - \lambda_i| \le \sqrt{2}\,|b(T)|$ and $|\omega(T) - \lambda_i| \le 2\sqrt{2}\,|b(T)|$. ■
Another example of shift strategy, the mixed Wilkinson-Rayleigh strategy, uses
Wilkinson's shift unless the matrix is already near deflation, in which case we use
Rayleigh's:

$$\sigma(T) = \begin{cases} \rho(T), & |(T)_{n,n-1}| < \epsilon, \\ \omega(T), & |(T)_{n,n-1}| \ge \epsilon; \end{cases}$$

here $\epsilon > 0$ is a small constant.

Shift strategies are not required to be continuous and $\omega$ is definitely not. For a
shift strategy $\sigma$, let $\mathcal{S}_\sigma \subset \mathcal{T}_\Lambda$ be the singular support of $\sigma$, i.e., a minimal closed set
on whose complement $\sigma$ is smooth. For example, $\mathcal{S}_\omega$ is the set of matrices $T \in \mathcal{T}_\Lambda$
for which the two eigenvalues $\omega_-(T)$ and $\omega_+(T)$ of $\hat{T}$ are equidistant from $(T)_{n,n}$,
or, equivalently, for which $(T)_{n,n} = (T)_{n-1,n-1}$. The set $\mathcal{S}_\sigma$ will play an important
role later.

We consider the phase portrait of $F_\omega$ for $3 \times 3$ matrices. In this case, the reader
may check that the domain of $F_\omega$ is the full set $\mathcal{T}_\Lambda$. Let $\mathcal{J}_\Lambda \subset \mathcal{T}_\Lambda$ be the set of Jacobi
matrices similar to $\Lambda$, i.e., matrices $T \in \mathcal{T}_\Lambda$ with strictly positive subdiagonal entries.
It is known ([2], [11]) that the closure $\bar{\mathcal{J}}_\Lambda \subset \mathcal{T}_\Lambda$ is diffeomorphic to a hexagon. The
set $\bar{\mathcal{J}}_\Lambda$ is not invariant under $F_\omega$ but we may define a map from $\bar{\mathcal{J}}_\Lambda$ to itself
by dropping the signs of the subdiagonal entries of $F_\omega(T)$. As discussed above, this rather
standard procedure is mostly harmless.

Two examples of $F_\omega$ are given in Figure 1, which represents $\bar{\mathcal{J}}_\Lambda$ for $\Lambda =
\operatorname{diag}(1, 2, 4)$ on the left and $\Lambda = \operatorname{diag}(-1, 0, 1)$ on the right. The vertices are the six
diagonal matrices similar to $\Lambda$ and the edges consist of reduced matrices. Labels
indicate the diagonal entries of the corresponding matrices. Three edges form $\mathcal{D}_{\Lambda,0} \cap
\bar{\mathcal{J}}_\Lambda$: they alternate, starting from the bottom horizontal edge on both hexagons. The
set $\mathcal{S}_\omega \cap \bar{\mathcal{J}}_\Lambda$ is indicated in both cases.




Figure 1: The phase space of Wilkinson's step for n = 3. 

Vertices are fixed points of $F_\omega$ and boundary edges are invariant sets. A simple
arrow indicates the motion of the points $F_\omega^k(T)$ along the edge. Points $T$ on an arc
with a double arrow are taken to a diagonal matrix in a single step: the arc points
to $F_\omega(T)$. Arcs marked with a transversal segment consist of fixed points of $F_\omega$.

Points on both sides of $\mathcal{S}_\omega$ are taken far apart: there is essentially a jump
discontinuity along $\mathcal{S}_\omega$. From Theorem 2, the decay of the bottom subdiagonal
entry under Wilkinson's step away from $\mathcal{S}_\omega \cap \mathcal{D}_{\Lambda,0}$ is cubic. As discussed in [8],
near $\mathcal{S}_\omega \cap \mathcal{D}_{\Lambda,0}$ this decay is quadratic, but not cubic. For the left hexagon, cubic
convergence occurs in the long run because the sequence $F_\omega^k(T)$ stays close to this
intersection only for a few values of $k$, illustrating Theorem 3.

In the case $\Lambda = \operatorname{diag}(-1, 0, 1)$, the bottom edge consists of fixed points. This
gives rise to a special asymptotic behavior ([8]): the (fixed) point labeled by $(0, 0, 0)$
is actually the limit of a collection of sequences $(F_\omega^k(T))$ for which the convergence
is strictly quadratic.
4 Tubular coordinates

We collect a few basic facts about shifted QR steps. 

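Proposition 4.1 If $s$ is not an eigenvalue of $\Lambda$, the map $F_s : \mathcal{T}_\Lambda \to \mathcal{T}_\Lambda$ is a
diffeomorphism. The image of $F_{\lambda_i} : \mathcal{D}^i_\Lambda \to \mathcal{T}_\Lambda$ is $\mathcal{D}^i_{\Lambda,0}$. The restriction
$F_{\lambda_i}|_{\mathcal{D}^i_{\Lambda,0}} : \mathcal{D}^i_{\Lambda,0} \to \mathcal{D}^i_{\Lambda,0}$ is a diffeomorphism.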

Proof: If $s$ is not an eigenvalue, compute $F_s^{-1}(T)$ by factoring $T - sI$ as $RQ$, with $R$
upper triangular with the first $n-1$ diagonal entries positive and $Q \in SO(n)$: we
claim that $F_s(T_0) = T$ for $T_0 = QR + sI$, proving that $F_s$ is a diffeomorphism.
Indeed, $QR = T_0 - sI$ is a $Q_\star R_\star$ factorization and thus $F_s(T_0) = Q^* T_0 Q = RQ + sI = T$.

From the last sentence of Section 2, the image of $F_{\lambda_i}$ is contained in $\mathcal{D}^i_{\Lambda,0} \subset \mathcal{D}^i_\Lambda$.
The fact that the restriction of $F_{\lambda_i}$ to $\mathcal{D}^i_{\Lambda,0}$ is a diffeomorphism is proved as in the
previous paragraph. ■

Commutativity of steps is well known and related to the complete integrability
of the interpolating Toda flows ([5], [10], [12], [15]). For the reader's convenience
we provide a proof.

Proposition 4.2 Steps commute: $F_{s_0} \circ F_{s_1} = F_{s_1} \circ F_{s_0}$ in the appropriate domains.

The domain of $F_{s_0} \circ F_{s_1} = F_{s_1} \circ F_{s_0}$ is $\mathcal{T}_\Lambda$ if neither $s_0$ nor $s_1$ is an eigenvalue,
$\mathcal{D}^i_\Lambda$ if $s_0 = \lambda_i$ and $s_1$ is not an eigenvalue (or vice versa) and the empty set in the
rather pointless case $s_0 = \lambda_i$, $s_1 = \lambda_j$, $i \ne j$.

Proof: We prove commutativity only when $s_0$ and $s_1$ are not eigenvalues; the other
cases follow easily. Consider $Q_\star R_\star$ factorizations

$$T - s_0 I = Q_0 R_0, \qquad T - s_1 I = Q_1 R_1,$$
$$(T - s_0 I)(T - s_1 I) = (T - s_1 I)(T - s_0 I) = Q_2 R_2.$$

For $F_{s_0}(T) - s_1 I = Q_0^* (T - s_1 I) Q_0 = Q_3 R_3$, we have $F_{s_1}(F_{s_0}(T)) = Q_3^* F_{s_0}(T) Q_3 =
Q_3^* Q_0^* T Q_0 Q_3$. Thus

$$Q_0^* (T - s_1 I) Q_0 R_0 = Q_0^* (T - s_1 I)(T - s_0 I) = Q_0^* Q_2 R_2 = Q_3 R_3 R_0$$

and therefore $Q_0^* Q_2 = Q_3$ and $F_{s_1}(F_{s_0}(T)) = Q_2^* T Q_2$, an expression which is
symmetric in $s_0$ and $s_1$. ■
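
Commutativity can be observed numerically on a $3 \times 3$ example. The following sketch is only an illustration: it assumes $n = 3$, so that the last column of the orthogonal factor is a cross product, it stores matrices as lists of rows, and the name `signed_step` is hypothetical.

```python
import math


def signed_step(T, s):
    """F_s(T) = Q^T T Q for a 3x3 symmetric tridiagonal T (list of rows),
    where T - s I = Q R, det Q = 1 and R[0][0], R[1][1] > 0.

    Assumes the first two columns of T - s I are linearly independent.
    """
    M = [[T[i][j] - (s if i == j else 0.0) for j in range(3)] for i in range(3)]
    m1 = [M[i][0] for i in range(3)]
    m2 = [M[i][1] for i in range(3)]
    n1 = math.sqrt(sum(x * x for x in m1))
    q1 = [x / n1 for x in m1]
    p = sum(x * y for x, y in zip(q1, m2))
    w = [y - p * x for x, y in zip(q1, m2)]
    n2 = math.sqrt(sum(x * x for x in w))
    q2 = [x / n2 for x in w]
    q3 = [q1[1] * q2[2] - q1[2] * q2[1],
          q1[2] * q2[0] - q1[0] * q2[2],
          q1[0] * q2[1] - q1[1] * q2[0]]
    q = [q1, q2, q3]  # the columns of Q
    # Tq[j] is the vector T q_j; the (i, j) entry of Q^T T Q is q_i . (T q_j).
    Tq = [[sum(T[r][k] * q[j][k] for k in range(3)) for r in range(3)]
          for j in range(3)]
    return [[sum(q[i][r] * Tq[j][r] for r in range(3)) for j in range(3)]
            for i in range(3)]
```

Applying `signed_step` with two shifts that are not eigenvalues, in either order, should give the same matrix up to rounding.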

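Recall that a map $\Pi : X \to Y \subset X$ is a projection if $\Pi(X) = Y$ and $\Pi \circ \Pi = \Pi$.
The map $F_{\lambda_i} : \mathcal{D}^i_\Lambda \to \mathcal{D}^i_{\Lambda,0}$ is not a projection but can be used to define one: the
canonical projection $\Pi_i : \mathcal{D}^i_\Lambda \to \mathcal{D}^i_{\Lambda,0}$,

$$\Pi_i(T) = \left(F_{\lambda_i}|_{\mathcal{D}^i_{\Lambda,0}}\right)^{-1}(F_{\lambda_i}(T)).$$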

Proposition 4.3 The map $\Pi_i$ is indeed a smooth projection which commutes with
steps. More precisely, $\Pi_i(F_s(T)) = F_s(\Pi_i(T))$ provided $s$ is not an eigenvalue of $\Lambda$
different from $\lambda_i$.

Proof: The map $\Pi_i$ is clearly smooth and, for $T \in \mathcal{D}^i_{\Lambda,0}$, we have

$$\Pi_i(T) = \left(F_{\lambda_i}|_{\mathcal{D}^i_{\Lambda,0}}\right)^{-1}(F_{\lambda_i}(T)) = T,$$

proving that $\Pi_i$ is a projection. Commutativity follows from Proposition 4.2. ■
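For a diagonal matrix $\Lambda$ with simple spectrum and $\epsilon > 0$, the deflation neighborhood
$\mathcal{D}_{\Lambda,\epsilon} \subset \mathcal{T}_\Lambda$ is the closed set of matrices $T \in \mathcal{T}_\Lambda$ with $|b(T)| \le \epsilon$. This
notation is consistent with $\mathcal{D}_{\Lambda,0}$ for the deflation set. As we shall see in Propositions
4.4 and 5.1, for sufficiently small $\epsilon > 0$ the set $\mathcal{D}_{\Lambda,\epsilon}$ has connected components
$\mathcal{D}^i_{\Lambda,\epsilon} \subset \mathcal{D}^i_\Lambda$, $\mathcal{D}^i_{\Lambda,\epsilon} \supset \mathcal{D}^i_{\Lambda,0}$, which are invariant under steps $F_s$ for shifts $s$ near $\lambda_i$,
i.e., $F_s(\mathcal{D}^i_{\Lambda,\epsilon}) \subset \mathcal{D}^i_{\Lambda,\epsilon}$. The sets $\mathcal{D}^i_{\Lambda,\epsilon}$ are therefore also invariant under $F_\sigma$.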

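Denote the distance between a matrix $T$ and a compact set of matrices $\mathcal{N}$ by
$\operatorname{dist}(T, \mathcal{N}) = \min_{S \in \mathcal{N}} \|T - S\|$. Let $\gamma = \min_{i \ne j} |\lambda_i - \lambda_j|$ be the spectral gap of $\Lambda$
and $B = e_n e_{n-1}^* + e_{n-1} e_n^*$.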

Recall that if $\mathcal{N}$ is a submanifold of codimension $k$ of $\mathcal{M}$ then a closed tubular
neighborhood of $\mathcal{N}$ consists of a closed neighborhood $\mathcal{N}_\epsilon$ of $\mathcal{N}$ and a diffeomorphism
$\zeta : \mathcal{N}_\epsilon \to \mathcal{N} \times B^k_\epsilon$ with $\zeta(x) = (x, 0)$ for $x \in \mathcal{N}$ (here $B^k_\epsilon \subset \mathbb{R}^k$ is the closed ball
of radius $\epsilon$ around the origin). Given $x \in \mathcal{N}$, the preimage $\zeta^{-1}(\{x\} \times B^k_\epsilon)$ is a
manifold with boundary of dimension $k$, the fiber through $x$. We now construct
tubular neighborhoods of the deflation sets $\mathcal{D}^i_{\Lambda,0}$; here the codimension is $k = 1$.

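Proposition 4.4 Each $\mathcal{D}^i_{\Lambda,0} \subset \mathcal{T}_\Lambda$ is a compact submanifold of codimension 1
diffeomorphic to $\mathcal{T}_{\Lambda_i}$, where $\Lambda_i = \operatorname{diag}(\lambda_1, \dots, \lambda_{i-1}, \lambda_{i+1}, \dots, \lambda_n)$. There exists $\epsilon_{\mathrm{tub}} > 0$
such that for $\epsilon \in (0, \epsilon_{\mathrm{tub}})$:

(a) the connected components $\mathcal{D}^i_{\Lambda,\epsilon}$ of $\mathcal{D}_{\Lambda,\epsilon}$ consist of matrices $T \in \mathcal{D}_{\Lambda,\epsilon}$ for which
$|(T)_{n,n} - \lambda_i| \le \sqrt{2}\,\epsilon$;

(b) the map $\zeta : \mathcal{D}^i_{\Lambda,\epsilon} \to \mathcal{D}^i_{\Lambda,0} \times [-\epsilon, \epsilon]$ given by $\zeta(T) = (\Pi_i(T), b(T))$ is a closed
tubular neighborhood of $\mathcal{D}^i_{\Lambda,0}$;

(c) there is a constant $C_b > 0$ such that for all $T \in \mathcal{D}^i_{\Lambda,\epsilon}$,

$$|b(T)| \le \operatorname{dist}(T, \mathcal{D}^i_{\Lambda,0}) \le \|T - \Pi_i(T)\| \le C_b |b(T)|.$$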

Proof: We first show that the gradient of the restriction $b|_{\mathcal{T}_\Lambda}$ at a point $T_{\mathcal{D}} \in \mathcal{D}_{\Lambda,0}$
is not zero. Consider the characteristic polynomial along the line $T_{\mathcal{D}} + tB$: this is
a smooth even function of $t$ and therefore $B$ is tangent to $\mathcal{T}_\Lambda$ at $T_{\mathcal{D}}$, the point at
which $t = 0$. On the other hand, the directional derivative of $b$ along the same line
equals 1. Thus $\mathcal{D}_{\Lambda,0} \subset \mathcal{T}_\Lambda$ is a submanifold of codimension 1. The diffeomorphism
with $\mathcal{T}_{\Lambda_i}$ takes $T$ to its leading $(n-1) \times (n-1)$ principal minor.

Assume $\epsilon < \gamma/(2\sqrt{2})$. Consider matrices $T \in \mathcal{D}_{\Lambda,\epsilon}$ and $S = T - b(T)B$, so that
$(T)_{n,n}$ is an eigenvalue of $S$. By the Wielandt-Hoffman theorem, there exists an
index $i$ for which $|(T)_{n,n} - \lambda_i| \le \sqrt{2}\,\epsilon$, defining the sets $\mathcal{D}^i_{\Lambda,\epsilon}$ (at this point we do
not yet know that $\mathcal{D}^i_{\Lambda,\epsilon}$ is connected).

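For $T_{\mathcal{D}} \in \mathcal{D}^i_{\Lambda,0}$, the derivative $D\Pi_i(T_{\mathcal{D}})$ equals the identity on the subspace
tangent to $\mathcal{D}^i_{\Lambda,0}$ and has a kernel of dimension 1. Thus, for sufficiently small $\epsilon_{\mathrm{tub}}$,
item (b) holds. This also proves that each $\mathcal{D}^i_{\Lambda,\epsilon}$ is connected, completing the proof
of item (a).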

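The first two inequalities in (c) are trivial. Now

$$\|T - \Pi_i(T)\| = \|\zeta^{-1}(\Pi_i(T), b(T)) - \zeta^{-1}(\Pi_i(T), 0)\| \le C_b |b(T)|,$$

where the derivative of $\zeta^{-1}(T_{\mathcal{D}}, b)$ with respect to the second coordinate is bounded
by $C_b$ on the compact set $\mathcal{D}_{\Lambda,0} \times [-\epsilon_{\mathrm{tub}}, \epsilon_{\mathrm{tub}}]$. ■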

The diffeomorphism $\zeta$ defines tubular coordinates for $T \in \mathcal{D}^i_{\Lambda,\epsilon}$: the matrix
$\Pi_i(T) \in \mathcal{D}^i_{\Lambda,0} \approx \mathcal{T}_{\Lambda_i}$ and $b(T)$. Under tubular coordinates, QR steps with shift are
given by a simple formula.
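Corollary 4.5 Consider $\Lambda$, $i$ and $\epsilon \in (0, \epsilon_{\mathrm{tub}})$. Then

$$\zeta \circ F_s \circ \zeta^{-1} : \mathcal{D}^i_{\Lambda,0} \times [-\epsilon, \epsilon] \to \mathcal{D}^i_{\Lambda,0} \times [-\epsilon, \epsilon],$$
$$(T, b) \mapsto \left( F_s(T), \ \frac{(R_\star)_{n,n}}{(R_\star)_{n-1,n-1}} \, b \right),$$

where $\zeta^{-1}(T, b) - sI = Q_\star R_\star$.

Proof: This follows directly from Lemma 2.2 and Propositions 4.3 and 4.4. ■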

5 Convergence to deflation 

Sufficiently thin deflation neighborhoods $\mathcal{D}^i_{\Lambda,\epsilon}$ are invariant under $F_s$ for $s \approx \lambda_i$.

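Proposition 5.1 Given $C > 0$, there exists $\epsilon_{\mathrm{inv}} \in (0, \epsilon_{\mathrm{tub}})$ such that for any
$\epsilon \in (0, \epsilon_{\mathrm{inv}})$ and $s \in [\lambda_i - C\epsilon, \lambda_i + C\epsilon]$ we have $F_s(\mathcal{D}^i_{\Lambda,\epsilon}) \subset \operatorname{int}(\mathcal{D}^i_{\Lambda,\epsilon/2})$.

For a simple shift strategy $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$, there exists $\epsilon_{\mathrm{inv}} > 0$ such that if $\epsilon \in
(0, \epsilon_{\mathrm{inv}})$ then $F_\sigma(\mathcal{D}^i_{\Lambda,\epsilon}) \subset \operatorname{int}(\mathcal{D}^i_{\Lambda,\epsilon/2})$.

In particular, $F_s$ is well defined in $\mathcal{D}^i_{\Lambda,\epsilon}$ for $\epsilon \in (0, \epsilon_{\mathrm{inv}})$.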
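Proof: Recall that $F_{\lambda_i}(\mathcal{D}^i_\Lambda) = \mathcal{D}^i_{\Lambda,0}$. From Lemma 2.3, the derivative of $b \circ \Phi_\star$ is
zero at $\mathcal{D}^i_{\Lambda,0} \times \{\lambda_i\}$. Compactness of $\mathcal{D}^i_{\Lambda,0}$ thus implies that in a sufficiently small
neighborhood of $\mathcal{D}^i_{\Lambda,0} \times \{\lambda_i\}$ we have $|b(F_s(T))| \le |b(T)|/3$.

Now consider a simple shift strategy $\sigma$: by condition (II) there exists $C_\sigma > 0$
such that $|\sigma(T) - \lambda_i| \le C_\sigma |b(T)|$; apply the first statement with $C = C_\sigma$. ■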

Thus, $F_\sigma$ squeezes neighborhoods $\mathcal{D}^i_{\Lambda,\epsilon}$ at least linearly. Condition (I) and
smoothness imply a stronger version of (II). We do not want to assume, however,
that $\mathcal{D}_{\Lambda,0} \cap \mathcal{S}_\sigma = \emptyset$: after all, this is not true even for Wilkinson's shift. We need a
more careful statement.

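Lemma 5.2 Consider a shift strategy $\sigma$ and $\epsilon_{\mathrm{inv}}$ as in Proposition 5.1. For a
compact set $\mathcal{K} \subset \mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{inv}}} \setminus (\mathcal{D}^i_{\Lambda,0} \cap \mathcal{S}_\sigma)$, there exists $C_{\mathcal{K}}$ such that for all $T \in \mathcal{K}$ we
have $|\sigma(T) - \lambda_i| \le C_{\mathcal{K}} \, |b(T)|^2$.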

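Proof: Let $\mathcal{K}_{\mathcal{D}} = \mathcal{K} \cap \mathcal{D}^i_{\Lambda,0}$; enlarge it along $\mathcal{D}^i_{\Lambda,0}$ to obtain another compact
set $\mathcal{K}_1 \subset \mathcal{D}^i_{\Lambda,0} \setminus \mathcal{S}_\sigma$ with $\mathcal{K}_{\mathcal{D}}$ contained in the interior of $\mathcal{K}_1$ relative to $\mathcal{D}^i_{\Lambda,0}$.
Fatten $\mathcal{K}_1$ along fibers to define the set $\zeta^{-1}(\mathcal{K}_1 \times [-\epsilon, \epsilon])$, $\epsilon \in (0, \epsilon_{\mathrm{inv}})$,
which, without loss, still avoids $\mathcal{S}_\sigma$. For each $T_{\mathcal{D}} \in
\mathcal{K}_1$, consider the function $h_{T_{\mathcal{D}}}(b) = \sigma(\zeta^{-1}(T_{\mathcal{D}}, b))$, obtained by restricting $\sigma$ to a
fiber of $\mathcal{D}^i_{\Lambda,\epsilon}$. Each $h_{T_{\mathcal{D}}}$ is smooth and even (from condition (I)) and therefore
satisfies $|h_{T_{\mathcal{D}}}(b) - \lambda_i| \le C |b|^2$ for a constant $C$ depending on $T_{\mathcal{D}}$. By compactness,
the constant $C$ can be chosen uniformly for all $T_{\mathcal{D}} \in \mathcal{K}_1$. In other words, there exists $C$ such that
$|\sigma(T) - \lambda_i| \le C |b(T)|^2$ for all $T$ in the fattened set. The estimate for $T \in \mathcal{K}$ outside
the fattened set is trivial. ■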

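Proof of Theorem 2: Take $\epsilon = \epsilon_{\mathrm{inv}}$ as in Proposition 5.1 so that $\mathcal{D}^i_{\Lambda,\epsilon}$ is invariant
under $F_\sigma$.

Let $\varphi = b \circ \Phi_\star$. We compute the Taylor expansion of $\varphi(T, s)$ at $(T_{\mathcal{D}}, \lambda_i)$, $T_{\mathcal{D}} \in
\mathcal{D}^i_{\Lambda,0}$: from Lemma 2.3 the gradient of $\varphi$ at $(T_{\mathcal{D}}, \lambda_i)$ is zero. Thus, up to a third
order remainder,

$$\varphi(T, s) = \varphi(T_{\mathcal{D}}, \lambda_i) + \tfrac{1}{2} \varphi_{T,T}(T_{\mathcal{D}}, \lambda_i)(T - T_{\mathcal{D}}, T - T_{\mathcal{D}})
+ \varphi_{T,s}(T_{\mathcal{D}}, \lambda_i)(T - T_{\mathcal{D}}, s - \lambda_i)$$
$$\qquad + \tfrac{1}{2} \varphi_{s,s}(T_{\mathcal{D}}, \lambda_i)(s - \lambda_i, s - \lambda_i) + \operatorname{Rem}_3(T - T_{\mathcal{D}}, s - \lambda_i).$$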
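Now, $\varphi(T_{\mathcal{D}}, \lambda_i) = 0$ and, again from Lemma 2.3, $\varphi(T, \lambda_i) = 0$ for all $T$ in the domain of $F_{\lambda_i}$,
hence $\varphi_{T,T}(T_{\mathcal{D}}, \lambda_i) = 0$. Let $C_\sigma$ be the constant in condition (II) of the definition
of a simple shift strategy. By compactness, there exists $C_1 > 0$ such that for all
$T_{\mathcal{D}} \in \mathcal{D}^i_{\Lambda,0}$, $T \in \mathcal{D}^i_{\Lambda,\epsilon}$ and $s \in [\lambda_i - C_\sigma \epsilon, \lambda_i + C_\sigma \epsilon]$, we have

$$|\varphi(T, s)| \le C_1 |s - \lambda_i| \left( \|T - T_{\mathcal{D}}\| + |s - \lambda_i| \right).$$

We now apply this estimate for $T_{\mathcal{D}} = \Pi_i(T)$, where $T \in \mathcal{D}^i_{\Lambda,\epsilon}$. By Proposition 4.4,
since $\epsilon < \epsilon_{\mathrm{tub}}$, $\|T - T_{\mathcal{D}}\| = \|T - \Pi_i(T)\| \le C_b |b(T)|$ and therefore

$$|\varphi(T, s)| \le C_1 |s - \lambda_i| \left( C_b |b(T)| + |s - \lambda_i| \right),$$

implying the quadratic estimate

$$|b(F_\sigma(T))| = |\varphi(T, \sigma(T))| \le C_1 |\sigma(T) - \lambda_i| \left( C_b |b(T)| + |\sigma(T) - \lambda_i| \right) \le C_q |b(T)|^2.$$

Using Lemma 5.2 instead of condition (II) yields the cubic estimate. ■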

As a corollary, we obtain the well-known fact that, near deflation, the rate of 
convergence of Rayleigh's shift is cubic. Similarly, the mixed Wilkinson-Rayleigh 
strategy has cubic convergence. The rate of convergence for Wilkinson's strategy is 
far subtler. 
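The cubic decay under Rayleigh's shift can be observed numerically. The following is a minimal sketch, not the authors' code: the helper names `shifted_qr_step` and `rayleigh_history` and the test matrix `T0` are illustrative assumptions, and the QR factorization is computed with Givens rotations, whose sign convention may differ from the positive-diagonal convention of the text without changing $|b|$.

```python
import math

def shifted_qr_step(T, s):
    """One shifted QR step T -> Q^T T Q, where T - s I = Q R (Givens rotations)."""
    n = len(T)
    A = [[T[i][j] - (s if i == j else 0.0) for j in range(n)] for i in range(n)]
    rotations = []
    for k in range(n - 1):
        a, b = A[k][k], A[k + 1][k]
        r = math.hypot(a, b)
        c, d = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        rotations.append((c, d))
        for j in range(n):  # rotate rows k, k+1 to zero the subdiagonal entry
            x, y = A[k][j], A[k + 1][j]
            A[k][j], A[k + 1][j] = c * x + d * y, -d * x + c * y
    for k, (c, d) in enumerate(rotations):  # form R Q by rotating columns k, k+1
        for i in range(n):
            x, y = A[i][k], A[i][k + 1]
            A[i][k], A[i][k + 1] = c * x + d * y, -d * x + c * y
    for i in range(n):
        A[i][i] += s  # undo the shift: R Q + s I = Q^T T Q
    return A

def rayleigh_history(T, steps):
    """Record |b(T)| = |T[n-1][n-2]| along QR steps with Rayleigh's shift."""
    history = [abs(T[-1][-2])]
    for _ in range(steps):
        T = shifted_qr_step(T, T[-1][-1])  # Rayleigh's shift: bottom diagonal entry
        history.append(abs(T[-1][-2]))
    return T, history

# Illustrative starting matrix, already close to deflation (|b| = 0.1).
T0 = [[1.0, 0.5, 0.0], [0.5, 2.0, 0.1], [0.0, 0.1, 4.0]]
T_final, history = rayleigh_history(T0, 4)
for k, b in enumerate(history):
    print(k, b)
```

Printing the history shows the number of correct digits roughly tripling at each step until rounding error dominates; the trace is preserved because each step is an orthogonal similarity.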



6 Deflationary strategies 

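On the way to prove Theorem 1, we construct a larger invariant set for $F_\sigma$. Let
$\mathcal{U}_\Lambda \subset \mathcal{T}_\Lambda$ be the set of unreduced matrices; for $\epsilon > 0$, let $\mathcal{U}_{\Lambda,\epsilon} = \mathcal{U}_\Lambda \cup \mathrm{int}(\mathcal{D}_{\Lambda,\epsilon})$.
Notice that $\mathcal{U}_{\Lambda,\epsilon}$ is open, dense and path-connected.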

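Lemma 6.1 For a shift strategy $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$, $\epsilon_{\mathrm{inv}}$ as in Proposition 5.1 and $\epsilon \in (0, \epsilon_{\mathrm{inv}})$,
the open set $\mathcal{U}_{\Lambda,\epsilon}$ is invariant under $F_\sigma$.

Proof: If $T \in \mathcal{U}_\Lambda$ and $\sigma(T)$ is not in the spectrum then $F_\sigma(T)$ is (well defined
and) unreduced. If $T \in \mathcal{U}_\Lambda$ and $\sigma(T) = \lambda_i$ then $F_\sigma(T) \in \mathcal{D}^i_{\Lambda,0} \subset \mathcal{U}_{\Lambda,\epsilon}$. Finally, if
$T \in \mathrm{int}(\mathcal{D}_{\Lambda,\epsilon})$ then, by Proposition 5.1, $F_\sigma(T) \in \mathrm{int}(\mathcal{D}_{\Lambda,\epsilon/2}) \subset \mathcal{U}_{\Lambda,\epsilon}$. ■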

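Notice that we do not assume $\sigma$ or $F_\sigma$ to be continuous.

A simple shift strategy $\sigma$ is deflationary if for any $T \in \mathcal{U}_{\Lambda,\epsilon_{\mathrm{inv}}}$ there exists $K \in \mathbb{N}$
such that $F_\sigma^K(T) \in \mathcal{D}_{\Lambda,\epsilon_{\mathrm{inv}}}$.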

Rayleigh's strategy is known not to be deflationary. The following well-known
estimate ([B] and [13], section 8-10) implies that Wilkinson's strategy $\omega$ is not only
deflationary but uniformly so, in the sense that there exists $K$ with $F_\omega^K(\mathcal{U}_{\Lambda,\epsilon_{\mathrm{inv}}}) \subset \mathcal{D}_{\Lambda,\epsilon_{\mathrm{inv}}}$.
As a corollary, the mixed Wilkinson-Rayleigh strategy is also uniformly
deflationary provided $\epsilon > 0$ is sufficiently small.



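Fact 6.2 For $T \in \mathcal{T}_\Lambda$ and $k \in \mathbb{N}$,

$$|b(F_\omega^k(T))|^3 \le \frac{|b(T)^2 \, (T)_{n-1,n-2}|}{(\sqrt{2})^{k-1}}.$$

In [15], the result is shown for unreduced matrices; the case $T \in \mathcal{U}_{\Lambda,\epsilon_{\mathrm{inv}}}$ follows
by elementary limiting arguments. Notice that for $T \in \mathcal{T}_\Lambda$, the numerator
$|b(T)^2 \, (T)_{n-1,n-2}|$ is uniformly bounded.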
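For concreteness, here is a minimal sketch of QR steps with Wilkinson's shift, taken (as is standard) to be the eigenvalue of the bottom $2 \times 2$ block closest to the bottom diagonal entry. The helper names, the tie-breaking rule in `wilkinson_shift` and the starting matrix are illustrative assumptions, not taken from the text.

```python
import math

def shifted_qr_step(T, s):
    """One shifted QR step T -> Q^T T Q, where T - s I = Q R (Givens rotations)."""
    n = len(T)
    A = [[T[i][j] - (s if i == j else 0.0) for j in range(n)] for i in range(n)]
    rotations = []
    for k in range(n - 1):
        a, b = A[k][k], A[k + 1][k]
        r = math.hypot(a, b)
        c, d = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        rotations.append((c, d))
        for j in range(n):  # rotate rows k, k+1
            x, y = A[k][j], A[k + 1][j]
            A[k][j], A[k + 1][j] = c * x + d * y, -d * x + c * y
    for k, (c, d) in enumerate(rotations):  # rotate columns k, k+1 to form R Q
        for i in range(n):
            x, y = A[i][k], A[i][k + 1]
            A[i][k], A[i][k + 1] = c * x + d * y, -d * x + c * y
    for i in range(n):
        A[i][i] += s
    return A

def wilkinson_shift(T):
    """Eigenvalue of the bottom 2x2 block closest to T[n-1][n-1]."""
    a, b, c = T[-2][-2], T[-1][-2], T[-1][-1]
    if b == 0.0:
        return c  # already deflated
    delta = 0.5 * (a - c)
    sign = 1.0 if delta >= 0.0 else -1.0  # assumed tie-break when delta == 0
    return c - sign * b * b / (abs(delta) + math.hypot(delta, b))

# Illustrative unreduced matrix; its characteristic polynomial is
# x^3 - 6 x^2 + 9 x - 3.
T = [[1.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]]
history = [abs(T[-1][-2])]
for _ in range(10):
    T = shifted_qr_step(T, wilkinson_shift(T))
    history.append(abs(T[-1][-2]))
for k, b in enumerate(history):
    print(k, b)
```

After deflation, the bottom diagonal entry of `T` approximates an eigenvalue of the starting matrix, which can be checked against the characteristic polynomial quoted in the comment.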

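We now prove Theorem 1: if the shift strategy $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$ is continuous then it
is not deflationary.

Proof of Theorem 1: Fix $\epsilon = \epsilon_{\mathrm{inv}}/2$. Let $B^i \subset \mathcal{U}_{\Lambda,\epsilon}$ be the basins of attraction
of each invariant neighborhood $\mathcal{D}^i_{\Lambda,\epsilon}$, i.e., $T \in B^i$ if there exists $k \in \mathbb{N}$ such that
$F_\sigma^k(T) \in \mathcal{D}^i_{\Lambda,\epsilon}$. The sets $B^i$ are clearly disjoint with $\mathcal{D}^i_{\Lambda,\epsilon} \subset B^i$. If $\sigma$ is continuous,
they are also open subsets of $\mathcal{U}_{\Lambda,\epsilon}$ since $B^i = \bigcup_k F_\sigma^{-k}(\mathrm{int}(\mathcal{D}^i_{\Lambda,\epsilon}))$. If $\sigma$ is deflationary,
$\bigcup_i B^i = \mathcal{U}_{\Lambda,\epsilon}$. Thus, if $\sigma$ is both continuous and deflationary then $\mathcal{U}_{\Lambda,\epsilon}$ is not
connected, a contradiction. ■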



7 Dynamics of shifts for a.p. free matrices 

From the previous section, cubic convergence may be lost when the orbit $F_\sigma^k(T)$
passes near the set $S_\sigma \cap \mathcal{D}^i_{\Lambda,0}$. Our next task is to measure when this happens, by
studying the dynamics associated to a shift strategy in a deflation neighborhood,
i.e., the iterates of $F_\sigma : \mathcal{D}^i_{\Lambda,\epsilon} \to \mathcal{D}^i_{\Lambda,\epsilon}$, $\epsilon \in (0, \epsilon_{\mathrm{inv}})$. Most of what we need can be
read in the projection onto $\mathcal{D}^i_{\Lambda,0}$, where $F_\sigma$ coincides with $F_{\lambda_i}$.

A matrix $T \in \mathcal{T}$ with simple spectrum is a.p. free if no three eigenvalues are
in arithmetic progression and a.p. otherwise. Different kinds of spectra lead to
different dynamics: in this section we handle the a.p. free case, clearly a generic
restriction. Let $\hat{T}$ be the leading principal $(n-1) \times (n-1)$ minor of $T$. The following
result is standard.

Proposition 7.1 Let $\Lambda \in \mathcal{T}$ be an $n \times n$ diagonal a.p. free matrix with spectrum
$\lambda_1 < \cdots < \lambda_n$. For each $i$, consider $F_{\lambda_i} : \mathcal{D}^i_{\Lambda,0} \to \mathcal{D}^i_{\Lambda,0}$ as above. For any $T \in \mathcal{D}^i_{\Lambda,0}$,
the sequence $(F_{\lambda_i}^k(T))$ converges to a diagonal matrix.

Proof: The map $F_{\lambda_i}$ on $\mathcal{D}^i_{\Lambda,0}$ amounts to a QR step with shift $\lambda_i$ on $\hat{T}$, which has
eigenvalues $\lambda_j$, $j \ne i$. The a.p. free hypothesis implies that the absolute values of
the eigenvalues of $\hat{T} - \lambda_i I$ are distinct. If $\hat{T}$ is unreduced then, as is well known,
the standard QR iteration converges to a diagonal matrix, with diagonal entries in
decreasing order of absolute value. More generally, if $\hat{T}$ is reduced, apply the above
result to each unreduced sub-block. ■

We shall use height functions for the QR steps $F_s$, $s$ near $\lambda_i$, i.e., functions
$H_i : \mathcal{D}^i_{\Lambda,\epsilon} \to \mathbb{R}$ with $H_i(F_s(T)) > H_i(T)$ provided $T$ is not diagonal. Such height
functions and related scenarios have been considered in pQ, [3], [10] and [14].

The matrix $W = \operatorname{diag}(w_1, \ldots, w_n)$ is a weight matrix if $w_1 > \cdots > w_n$. Since $\Lambda$
is a.p. free, there exists $\epsilon_{\mathrm{ap}} \in (0, \epsilon_{\mathrm{inv}})$ such that if $s \in \mathcal{I}_i = [\lambda_i - \epsilon_{\mathrm{ap}}, \lambda_i + \epsilon_{\mathrm{ap}}]$ then
the numbers $|\lambda_j - s|$ are distinct and their order does not depend on $s$.
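The size of $\epsilon_{\mathrm{ap}}$ can be made explicit. The following worked computation is a sketch: the bound is an elementary estimate of ours, not a formula quoted from the text, and it ignores the additional requirement $\epsilon_{\mathrm{ap}} < \epsilon_{\mathrm{inv}}$, which depends on the strategy.

```latex
% Two distances coincide exactly when s is the midpoint of the two eigenvalues:
|\lambda_j - s| = |\lambda_k - s| \ (j \ne k)
  \iff s = \tfrac{1}{2}(\lambda_j + \lambda_k).
% Hence the order of the numbers |\lambda_j - s| is constant on
% [\lambda_i - \epsilon, \lambda_i + \epsilon] whenever
\epsilon < \min_{j < k} \left| \tfrac{1}{2}(\lambda_j + \lambda_k) - \lambda_i \right| .
% The minimum is positive: a midpoint equal to \lambda_i would force either
% \lambda_j = \lambda_k (excluded by simplicity) or an arithmetic progression
% \lambda_j, \lambda_i, \lambda_k (excluded by the a.p. free hypothesis).
% Example: \Lambda = \mathrm{diag}(1, 2, 4) and \lambda_i = 2. The midpoints are
% 3/2, 5/2 and 3, at distances 1/2, 1/2 and 1 from \lambda_i,
% so any \epsilon < 1/2 works.
```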

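Proposition 7.2 Let $\Lambda$ be an a.p. free diagonal matrix, $W$ a weight matrix and
$\epsilon_{\mathrm{ap}}$ as above. For $\delta_H > 0$, set $\eta_i(x) = \log((x - \lambda_i)^2 + \delta_H)$ and let $H_i : \mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{ap}}} \to \mathbb{R}$
be defined by $H_i(T) = \operatorname{tr}(W \eta_i(T))$. There exists $\delta_H > 0$ such that

$$\max_{T \in \partial\mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{ap}}}} H_i(T) < \min_{T \in \mathcal{D}^i_{\Lambda,0}} H_i(T)$$

and, for any $s \in \mathcal{I}_i$, $H_i$ is a height function for $F_s : \mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{ap}}} \to \mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{ap}}}$.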

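Here, $\eta_i(T) = X \operatorname{diag}(\eta_i(\lambda_1), \ldots, \eta_i(\lambda_n)) X^{-1}$ for $T = X \Lambda X^{-1}$, so that if $p$ is
a polynomial and $\eta_i(\lambda_j) = p(\lambda_j)$ for $j = 1, \ldots, n$ then $\eta_i(T) = p(T)$. The only
conditions on $\eta_i$ which will be used in the proof are that $|\lambda_j - \lambda_i| < |\lambda_k - \lambda_i|$ implies
$\eta_i(\lambda_j) < \eta_i(\lambda_k)$ and that $\eta_i(\lambda_i)$ is very negative (for small $\delta_H$).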

The proof requires some basic facts about $f$-$Q_\star R_\star$ steps; these facts will not
be used elsewhere. For a real diagonal matrix $\Lambda$ with simple spectrum, let $\mathcal{O}_\Lambda$ be
the set of all real symmetric matrices similar to $\Lambda$; it is well known that $\mathcal{O}_\Lambda$ is a
smooth compact manifold. The $f$-$Q_\star R_\star$ step applied to a matrix $S \in \mathcal{O}_\Lambda$ is the
map $F_f : \mathcal{A}_{\Lambda,f} \to \mathcal{O}_\Lambda$ defined by $F_f(S) = Q_\star^{*} S Q_\star$, where $Q_\star$ is obtained from the
factorization $f(S) = Q_\star R_\star$ and $S \in \mathcal{A}_{\Lambda,f}$ if and only if $f(S)$ is almost invertible. If
$T \in \mathcal{T}_\Lambda \cap \mathcal{A}_{\Lambda,f}$ then $F_f(T) \in \mathcal{T}_\Lambda$ (use the same proof as in Lemma 2.2). The maps
$F_s : \mathcal{T}_\Lambda \to \mathcal{T}_\Lambda$ defined above correspond to restrictions of $F_f$ for $f(x) = x - s$.

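For a continuous function $h : \mathbb{R} \to \mathbb{R}$, if $S \in \mathcal{O}_\Lambda$ then the matrix function $h(S)$
belongs to $\mathcal{O}_M$, where $M = h(\Lambda)$. With the obvious abuse of notation, we have a
diffeomorphism $h : \mathcal{O}_\Lambda \to \mathcal{O}_M$ provided $h$ is injective in the spectrum of $\Lambda$.

Lemma 7.3 For $h$ injective in the spectrum of $\Lambda$, consider the diffeomorphism
$h : \mathcal{O}_\Lambda \to \mathcal{O}_M$, where $M = h(\Lambda)$. Let $f$ and $\tilde{f}$ be continuous functions defined in
neighborhoods of the spectra of $\Lambda$ and $M$, respectively, satisfying $\tilde{f}(h(\lambda_j)) = f(\lambda_j)$
for each $j$, with QR steps $F_f : \mathcal{O}_\Lambda \to \mathcal{O}_\Lambda$ and $F_{\tilde{f}} : \mathcal{O}_M \to \mathcal{O}_M$. Then $h \circ F_f = F_{\tilde{f}} \circ h$.

Proof: The hypothesis implies that, for $T \in \mathcal{O}_\Lambda$, $f(T) = \tilde{f}(h(T)) = QR$ and hence
$F_f(T) = Q^{*} T Q$ and $F_{\tilde{f}}(h(T)) = Q^{*} h(T) Q$. Thus $h(F_f(T)) = F_{\tilde{f}}(h(T))$. ■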

Let $I_r$ be the $n \times n$ truncated identity matrix, i.e., $(I_r)_{i,i} = 1$ for $i \le r$, other
entries being equal to zero.

Lemma 7.4 Let $M$ be a diagonal matrix with simple spectrum and $f : \mathbb{R} \to \mathbb{R}$ be
a function for which $\mu_i < \mu_j$ implies $|f(\mu_i)| < |f(\mu_j)|$. Consider the $f$-$Q_\star R_\star$ step
$F_f : \mathcal{A}_{M,f} \to \mathcal{O}_M$. For any $S \in \mathcal{A}_{M,f}$ and $r = 1, \ldots, n-1$, $\operatorname{tr}(I_r F_f(S)) \ge \operatorname{tr}(I_r S)$.
For $r = 1$, equality only holds if $(S)_{1,j} = 0$ for all $j > 1$.

This argument follows closely the first proof in [3].

Proof: Let $V_r$ be the range of $I_r$ and $\mu_{r,j}(S)$ be the eigenvalues of the leading
principal $r \times r$ minor of $S$, listed in nondecreasing order. We claim that
$\mu_{r,j}(F_f(S)) \ge \mu_{r,j}(S)$, which immediately implies $\operatorname{tr}(I_r F_f(S)) \ge \operatorname{tr}(I_r S)$. Recall
that $F_f(S) = Q_\star^{*} S Q_\star$ where $Q_\star R_\star = f(S)$. Let $U$ be an upper triangular matrix
such that $Q_\star u = f(S) U u$ for $u \in V_r$. By min-max,

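$$\mu_{r,j}(S) = \max_{\substack{A \subset V_r \\ \dim(A) = r+1-j}} \; \min_{u \in A \setminus \{0\}} \frac{\langle u, S u \rangle}{\langle u, u \rangle},$$

$$\mu_{r,j}(F_f(S)) = \max_{A} \min_{u} \frac{\langle u, F_f(S) u \rangle}{\langle u, u \rangle} = \max_{A} \min_{u} \frac{\langle f(S) U u, S f(S) U u \rangle}{\langle f(S) U u, f(S) U u \rangle} = \max_{A' = UA} \; \min_{u' \in A' \setminus \{0\}} \frac{\langle f(S) u', S f(S) u' \rangle}{\langle f(S) u', f(S) u' \rangle}.$$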

Notice that since $U$ is upper triangular, the map taking $A \subset V_r$ to $A' = UA$ is a
bijection among subspaces of $V_r$ of given dimension. Since $S$ and $f(S)$ are symmetric
and commute,

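$$\mu_{r,j}(F_f(S)) = \max_{A} \min_{u} \frac{\langle u, S g(S) u \rangle}{\langle u, g(S) u \rangle},$$

where $g(x) = (f(x))^2$. The claim now follows from the inequality

$$\langle u, u \rangle \langle u, S g(S) u \rangle - \langle u, S u \rangle \langle u, g(S) u \rangle \ge 0.$$

Diagonalize $S = Q^{*} M Q$ and $g(S) = Q^{*} g(M) Q$ and write $Q u = (x_1, \ldots, x_n)$ so that

$$2 \left( \langle u, u \rangle \langle u, S g(S) u \rangle - \langle u, S u \rangle \langle u, g(S) u \rangle \right) = \sum_{k,\ell} (\mu_k - \mu_\ell)(g(\mu_k) - g(\mu_\ell)) \, x_k^2 x_\ell^2 \ge 0.$$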






Consider now equality for the case $r = 1$. Notice that, by hypothesis, if $k \ne \ell$ then
$(\mu_k - \mu_\ell)(g(\mu_k) - g(\mu_\ell)) > 0$. In the max-min formula for $\operatorname{tr}(I_1 S) = \mu_{1,1}(S)$, it
suffices to take $u = e_1$. Equality therefore holds only if $Q e_1$ is a canonical vector,
which implies $(S)_{1,j} = 0$ for all $j > 1$. ■
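The monotonicity of the partial traces can be checked numerically in the simplest case $f(x) = x$ with a positive definite matrix, where the hypothesis of the lemma holds automatically. This is a sanity check under stated assumptions, not part of the proof: the matrix `S` is an arbitrary illustrative choice (tridiagonal, so that Givens rotations on the subdiagonal suffice), and `qr_step` and `partial_traces` are hypothetical helper names.

```python
import math

def qr_step(S):
    """Unshifted QR step S -> Q^T S Q for a tridiagonal S, with S = Q R."""
    n = len(S)
    A = [row[:] for row in S]
    rotations = []
    for k in range(n - 1):
        a, b = A[k][k], A[k + 1][k]
        r = math.hypot(a, b)
        c, d = (1.0, 0.0) if r == 0.0 else (a / r, b / r)
        rotations.append((c, d))
        for j in range(n):  # rotate rows k, k+1 to zero the subdiagonal entry
            x, y = A[k][j], A[k + 1][j]
            A[k][j], A[k + 1][j] = c * x + d * y, -d * x + c * y
    for k, (c, d) in enumerate(rotations):  # R Q: rotate columns k, k+1
        for i in range(n):
            x, y = A[i][k], A[i][k + 1]
            A[i][k], A[i][k + 1] = c * x + d * y, -d * x + c * y
    return A

def partial_traces(S):
    """tr(I_r S) for r = 1, ..., n-1: sums of the first r diagonal entries."""
    return [sum(S[i][i] for i in range(r)) for r in range(1, len(S))]

# Positive definite by Gershgorin's theorem, so mu_i < mu_j implies |mu_i| < |mu_j|.
S = [[2.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 4.0]]
before = partial_traces(S)
after = partial_traces(qr_step(S))
print(before, after)
```

For this `S`, the first column of $Q$ is $(2, 1, 0)/\sqrt{5}$, so the new $(1,1)$ entry is $15/5 = 3 > 2$, in agreement with the strict inequality for $r = 1$ when $(S)_{1,2} \ne 0$.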

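Proof of Proposition 7.2: For all $s \in \mathcal{I}_i$ and any distinct eigenvalues $\lambda_j$ and
$\lambda_k$, $|\lambda_j - s| < |\lambda_k - s|$ if and only if $|\lambda_j - \lambda_i| < |\lambda_k - \lambda_i|$ (by the choice of $\epsilon_{\mathrm{ap}}$), that is, if and only if $\eta_i(\lambda_j) < \eta_i(\lambda_k)$. For $s \in \mathcal{I}_i$, $f(x) = x - s$,
$h(x) = \eta_i(x)$ and $\mu_j = \eta_i(\lambda_j)$, define $\tilde{f} : \mathbb{R} \to \mathbb{R}$ as in Lemma 7.3. The function $\tilde{f}$
satisfies the hypothesis of Lemma 7.4: $\mu_j < \mu_k$ implies $|\tilde{f}(\mu_j)| < |\tilde{f}(\mu_k)|$. Thus, by
Lemma 7.4, $\operatorname{tr}(W F_{\tilde{f}}(S)) \ge \operatorname{tr}(W S)$ for all $S \in \mathcal{O}_M$. For $T \in \mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{ap}}}$, take $S = h(T)$:
by Lemma 7.3, $F_{\tilde{f}}(h(T)) = h(F_f(T))$ and therefore $\operatorname{tr}(W h(F_f(T))) \ge \operatorname{tr}(W h(T))$.
Again by Lemma 7.4, equality happens only if $T$ is diagonal. Thus, $H_i$ is a height
function. Finally, choosing $\delta_H$ sufficiently small guarantees that $H_i$ is large in $\mathcal{D}^i_{\Lambda,0}$
and small in $\partial\mathcal{D}^i_{\Lambda,\epsilon_{\mathrm{ap}}}$, completing the proof. ■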

Thus, simple shift strategies admit height functions near the deflation set. Our
reason for constructing a height function is to control the time the sequence $(F_\sigma^k(T))$
stays in a compact set.

Assuming $\Lambda$ to be a.p. free, for a shift strategy $\sigma : \mathcal{T}_\Lambda \to \mathbb{R}$ set $\epsilon_\sigma = \epsilon_{\mathrm{ap}}/(1 + C_\sigma)$
(where $C_\sigma$ is the constant in the definition of a simple shift strategy). Notice that
$T \in \mathcal{D}^i_{\Lambda,\epsilon_\sigma}$ implies $\sigma(T) \in \mathcal{I}_i = [\lambda_i - \epsilon_{\mathrm{ap}}, \lambda_i + \epsilon_{\mathrm{ap}}]$.

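Corollary 7.5 Let $\Lambda$ be a real diagonal $n \times n$ a.p. free matrix, $\sigma$ a simple shift
strategy and $\mathcal{D}^i_{\Lambda,\epsilon_\sigma}$ as above. Let $\mathcal{K} \subset \mathcal{D}^i_{\Lambda,\epsilon_\sigma}$ be a compact set with no diagonal
matrices: there exists $K \in \mathbb{N}$ such that for all $T \in \mathcal{D}^i_{\Lambda,\epsilon_\sigma}$ there are at most $K$ points
of the form $F_\sigma^k(T)$ in $\mathcal{K}$.

The plan is to take $\mathcal{K}$ containing $S_\sigma \cap \mathcal{D}^i_{\Lambda,0}$: the hypothesis in Theorem 3 that
diagonal matrices do not belong to the singular support $S_\sigma$ is then natural.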
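Proof: Let $m_-$ be the minimum jump in $\mathcal{K}$, and $m_+$ the size of the image of $H_i$:

$$m_- = \inf_{T \in \mathcal{K},\, s \in \mathcal{I}_i} \left( H_i(F_s(T)) - H_i(T) \right), \qquad m_+ = \sup_{T \in \mathcal{D}^i_{\Lambda,\epsilon_\sigma}} H_i(T) - \inf_{T \in \mathcal{D}^i_{\Lambda,\epsilon_\sigma}} H_i(T).$$

By Proposition 7.2 and the compactness of $\mathcal{K} \times \mathcal{I}_i$, $m_- > 0$: take $K$ such that
$K m_- > m_+$. For a given $T$, let $X = \{k \in \mathbb{N} \mid F_\sigma^k(T) \in \mathcal{K}\}$: we have

$$m_+ \ge \sum_{k \in X} \left( H_i(F_\sigma^{k+1}(T)) - H_i(F_\sigma^k(T)) \right) \ge |X| \, m_-$$

and therefore $|X| < K$. ■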

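Proof of Theorem 3: Let $\mathcal{K}_1, \mathcal{K}_2 \subset \mathcal{D}^i_{\Lambda,\epsilon_\sigma}$ be compact sets with $\mathcal{K}_1 \cup \mathcal{K}_2 = \mathcal{D}^i_{\Lambda,\epsilon_\sigma}$,
$S_\sigma \cap \mathcal{D}^i_{\Lambda,0}$ disjoint from $\mathcal{K}_1$ and with no diagonal matrices in $\mathcal{K}_2$. By Theorem 2,
there exists $C_{\mathcal{K}_1} > 0$ such that $|b(F_\sigma(T))| \le C_{\mathcal{K}_1} |b(T)|^3$ for all $T \in \mathcal{K}_1$. By Corollary
7.5, there exists $K_2 \in \mathbb{N}$ such that, given $T \in \mathcal{D}^i_{\Lambda,\epsilon_\sigma}$, at most $K_2$ points of the form
$F_\sigma^k(T)$ belong to $\mathcal{K}_2$. In particular, there are at most $K_2$ values of $k$ for which the
estimate $|b(F_\sigma^{k+1}(T))| \le C_{\mathcal{K}_1} |b(F_\sigma^k(T))|^3$ does not hold. ■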



8 Convergence properties of a.p. spectra 

The aim of this section is to prove Theorem 4. An a.p. matrix $T \in \mathcal{T}$ with simple
spectrum is strong a.p. if three consecutive eigenvalues are in arithmetic progression
and weak a.p. otherwise.
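The three classes of simple spectra (a.p. free, weak a.p., strong a.p.) are easy to tell apart mechanically. The sketch below is illustrative only: the function name `classify_spectrum` and the tolerance `tol` are assumptions, and the input is assumed to be a simple spectrum.

```python
from itertools import combinations

def classify_spectrum(eigenvalues, tol=1e-12):
    """Classify a simple spectrum as 'a.p. free', 'weak a.p.' or 'strong a.p.'."""
    lam = sorted(eigenvalues)

    def in_progression(a, b, c):
        # a < b < c are in arithmetic progression iff b is the midpoint of a and c
        return abs((a + c) - 2.0 * b) <= tol

    # combinations of a sorted list yields increasing triples
    if not any(in_progression(a, b, c) for a, b, c in combinations(lam, 3)):
        return "a.p. free"
    consecutive = any(
        in_progression(lam[k], lam[k + 1], lam[k + 2]) for k in range(len(lam) - 2)
    )
    return "strong a.p." if consecutive else "weak a.p."

# The three spectra below are the ones used as examples in the text.
print(classify_spectrum([1, 2, 4]))        # a.p. free
print(classify_spectrum([-1, 0, 1]))       # strong a.p.
print(classify_spectrum([-1, 0, 0.3, 1]))  # weak a.p.
```

In the last spectrum the progression $-1, 0, 1$ is present but interrupted by $0.3$, so no three consecutive eigenvalues are in arithmetic progression.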
In the a.p. free case discussed in the previous sections, for an initial condition
$T \in \mathcal{D}^i_{\Lambda,\epsilon}$, the sequence $(F_\sigma^k(T))$ converges to a diagonal matrix; this follows from the
fact that $\sigma(T)$ is close to $\lambda_i$ for $T \in \mathcal{D}^i_{\Lambda,\epsilon}$. For weak a.p. spectra, convergence to a diagonal
matrix may not occur.

Assume $\Lambda$ to be weak a.p. Let $b_2(T) = T_{n-1,n-2}$ be the second-last subdiagonal
entry; for consistency, write $b_1(T) = b(T)$. For any $i$, there exists a unique index
$c(i)$ such that $\lambda_{c(i)}$ is the eigenvalue closest to $\lambda_i$. As we shall see, if $T \in \mathcal{D}^i_{\Lambda,\epsilon}$ then

$$\lim_{k \to \infty} b_1(F_\sigma^k(T)) = \lim_{k \to \infty} b_2(F_\sigma^k(T)) = 0, \qquad \lim_{k \to \infty} (F_\sigma^k(T))_{n,n} = \lambda_i;$$

furthermore, if $T$ is unreduced then

$$\lim_{k \to \infty} (F_\sigma^k(T))_{n-1,n-1} = \lambda_{c(i)}.$$

We begin with a technical lemma concerning the dynamics of steps F s . Item 
(b) is a variation of the power method argument used to study the convergence of 
lower entries under QR steps. 

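Lemma 8.1 Let $M = \operatorname{diag}(\mu_1, \ldots, \mu_m)$ be a real diagonal matrix with simple spectrum
and $\mathcal{T}_M \subset \mathcal{T}$ be the manifold of real $m \times m$ tridiagonal matrices similar to $M$.
Let $I \subset \mathbb{R}$ be a compact interval. Assume that there exists $j$, $1 \le j \le m$, such that

$$\mu_j \notin I, \qquad \max_{s \in I} |\mu_j - s| < \min_{s \in I,\, k \ne j} |\mu_k - s|.$$

Let $\mathcal{D}^j_{M,\epsilon} \subset \mathcal{T}_M$ be the $j$-th deflation neighborhood.

(a) There exist $\epsilon > 0$ and $C \in (0, 1)$ such that for all $\epsilon' \in (0, \epsilon)$ and $s \in I$ we have

$$F_s(\mathcal{D}^j_{M,\epsilon'}) \subset \mathcal{D}^j_{M,C\epsilon'}.$$

(b) Consider $T_0 \in \mathcal{T}_M$ unreduced, a sequence $(s_k)$ of elements of $I$ and $\epsilon > 0$.
Define $T_{k+1} = F_{s_k}(T_k)$. Then there exists $k$ such that $T_k \in \mathcal{D}^j_{M,\epsilon}$.

This will be used to study $b_2(T)$ for $T \in \mathcal{D}^i_{\Lambda,\epsilon}$, setting $I = [\lambda_i - \epsilon, \lambda_i + \epsilon]$, $j = c(i)$,
$M = \Lambda_i = \operatorname{diag}(\lambda_1, \ldots, \lambda_{i-1}, \lambda_{i+1}, \ldots, \lambda_n)$, with the natural identification between
$\mathcal{T}_M$ and $\mathcal{D}^i_{\Lambda,0}$.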

Proof: Let $C \in (0, 1)$ be such that

$$\max_{s \in I} |\mu_j - s| < C \min_{s \in I,\, k \ne j} |\mu_k - s|.$$

Write

$$r(s, T) = \frac{(R_\star)_{m,m}}{(R_\star)_{m-1,m-1}}, \qquad T - sI = Q_\star R_\star.$$

Recall from Lemma O and Corollary S3] that $b(F_s(T)) = r(s,T) \, b(T)$. We claim
that for all $T \in \mathcal{D}^j_{M,0}$ and $s \in I$, $|r(s,T)| < C$. Since $T \in \mathcal{D}^j_{M,0}$, $|(R_\star)_{m,m}| = |\mu_j - s|$.
Let $R_-$ be the leading principal minor of $R_\star$ of order $m - 1$: its singular values are
$|\mu_k - s|$, $k \ne j$. In particular, all singular values are larger than $|(R_\star)_{m,m}|/C$. Thus

$$|(R_\star)_{m-1,m-1}| = \|e_{m-1}^{*} R_-\| \ge \frac{|(R_\star)_{m,m}|}{C} \|e_{m-1}\| = \frac{|(R_\star)_{m,m}|}{C},$$

proving our claim. Take $\tilde{C} = (1 + C)/2$: by continuity, for sufficiently small $\epsilon > 0$,
we have $|r(s,T)| < \tilde{C}$ for all $T \in \mathcal{D}^j_{M,\epsilon}$, $s \in I$. Thus, for $T \in \mathcal{D}^j_{M,\epsilon}$ and $s \in I$,
$|b(F_s(T))| \le \tilde{C} |b(T)|$; item (a) follows.

For item (b), write $T_{k+1} = Q_k^{*} T_k Q_k$ where $T_k - s_k I = Q_k R_k$ is a QR
decomposition. Notice that, by hypothesis, $I$ is disjoint from the spectrum so that
$T_0 - s_0 I$ is invertible. We have $(T_0 - s_0 I)^{-1} = R_0^{-1} Q_0^{*}$ so the rows of $Q_0^{*}$ are obtained
from those of $(T_0 - s_0 I)^{-1}$ by Gram-Schmidt from bottom to top. In particular,
$Q_0 e_m = c_0 (T_0 - s_0 I)^{-1} e_m$, $c_0 > 0$. More generally, we claim that

$$P_k e_m = c \, (T_0 - s_{k-1} I)^{-1} \cdots (T_0 - s_1 I)^{-1} (T_0 - s_0 I)^{-1} e_m, \qquad c > 0, \quad P_k = Q_0 Q_1 \cdots Q_{k-1} \in SO(m).$$

Indeed, by induction and using that $T_1 = Q_0^{*} T_0 Q_0$,

$$P_k e_m = c' \, Q_0 (T_1 - s_{k-1} I)^{-1} \cdots (T_1 - s_1 I)^{-1} e_m = c' \, (T_0 - s_{k-1} I)^{-1} \cdots (T_0 - s_1 I)^{-1} Q_0 e_m = c \, (T_0 - s_{k-1} I)^{-1} \cdots (T_0 - s_1 I)^{-1} (T_0 - s_0 I)^{-1} e_m.$$

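For $\alpha = 1, \ldots, m$, let $v_\alpha$ be the unit eigenvector associated to $\mu_\alpha$. We claim that

$$\lim_{k \to \infty} P_k e_m = \pm v_j.$$

Indeed, write $e_m = \sum_{\alpha=1}^{m} a_\alpha v_\alpha$, where $a_\alpha = \langle v_\alpha, e_m \rangle$ is the last coordinate of $v_\alpha$.
It is well known that the last coordinates of the eigenvectors $v_\alpha$ of the unreduced
matrix $T_0$ are nonzero: in particular, $a_j \ne 0$; assume without loss $a_j > 0$. We have

$$P_k e_m = c \, (T_0 - s_{k-1} I)^{-1} \cdots (T_0 - s_1 I)^{-1} (T_0 - s_0 I)^{-1} e_m = c \sum_{\alpha} \frac{a_\alpha}{(\mu_\alpha - s_{k-1}) \cdots (\mu_\alpha - s_0)} v_\alpha = \pm c_k \Big( v_j + \sum_{\alpha \ne j} b_{k,\alpha} v_\alpha \Big),$$

$$c_k > 0, \qquad b_{k,\alpha} = \frac{a_\alpha}{a_j} \cdot \frac{\mu_j - s_{k-1}}{\mu_\alpha - s_{k-1}} \cdots \frac{\mu_j - s_0}{\mu_\alpha - s_0}.$$

Since $|\mu_j - s| / |\mu_\alpha - s| < C$ for $s \in I$ and $\alpha \ne j$, we have $|b_{k,\alpha}| < C^k |a_\alpha / a_j|$ and therefore
$\lim_{k \to \infty} b_{k,\alpha} = 0$, proving the claim. We have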

$$\lim_{k \to \infty} b(T_k) = \lim_{k \to \infty} (T_k)_{m,m-1} = \lim_{k \to \infty} e_{m-1}^{*} T_k e_m = \lim_{k \to \infty} (P_k e_{m-1})^{*} T_0 (P_k e_m) = \lim_{k \to \infty} (P_k e_{m-1})^{*} \mu_j (P_k e_m) + \lim_{k \to \infty} (P_k e_{m-1})^{*} (T_0 - \mu_j I)(P_k e_m).$$

The first limit in the last expression is zero because $P_k e_{m-1} \perp P_k e_m$; the second is
zero because $P_k e_{m-1}$ is bounded and

$$\lim_{k \to \infty} (T_0 - \mu_j I)(P_k e_m) = (T_0 - \mu_j I) \lim_{k \to \infty} (P_k e_m) = (T_0 - \mu_j I)(\pm v_j) = 0.$$



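Consider the double deflation set $\mathcal{C}_{\Lambda,0} \subset \mathcal{D}_{\Lambda,0} \subset \mathcal{T}_\Lambda$:

$$\mathcal{C}_{\Lambda,0} = \{ T \in \mathcal{T}_\Lambda \mid b_1(T) = b_2(T) = 0 \}.$$

For Wilkinson's strategy $\omega$, it turns out that the set $\mathcal{C}_{\Lambda,0}$ is disjoint from the singular
support $S_\omega$. More generally, if a shift strategy $\sigma$ satisfies $\mathcal{C}_{\Lambda,0} \cap S_\sigma = \emptyset$ then cubic
convergence of $F_\sigma$ holds even for weak a.p. spectra: this is Theorem 4, which we
prove below.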

In [5], we show examples of unreduced tridiagonal $3 \times 3$ matrices with spectrum
$-1, 0, 1$ for which Wilkinson's shift converges quadratically to a reduced but not
diagonal matrix in the singular support $S_\omega$. Similarly, we conjecture that for strong
a.p. diagonal $n \times n$ matrices $\Lambda$ there exists a set $X \subset \mathcal{T}_\Lambda$ of Hausdorff codimension
1 of unreduced matrices $T$ for which $(F_\omega^k(T))$ converges quadratically to a matrix in
$S_\omega \cap \mathcal{D}_{\Lambda,0}$ with $T_{n-1,n-2} \ne 0$.
With the natural identification between $\mathcal{D}^i_{\Lambda,0}$ and $\mathcal{T}_{\Lambda_i}$, we may consider $\mathcal{D}_{\Lambda_i,\epsilon_2}$
to be a subset of $\mathcal{D}^i_{\Lambda,0}$. Let



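For small $\epsilon_1, \epsilon_2 > 0$, $T \in \mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$ implies

$$T_{n-1,n-1} \approx \lambda_{c(i)}, \qquad T_{n,n} \approx \lambda_i, \qquad |b_1(T)| \le \epsilon_1, \qquad b_2(T) \approx 0.$$

These compact sets turn out to be manifolds with corners but we shall neither prove
nor use this fact. Lemma 8.1 can be rephrased in terms of the sets $\mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$.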

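Corollary 8.2 Let $\Lambda$ have weak a.p. spectrum and $\sigma$ be a simple shift strategy.
There exists $\epsilon > 0$ such that, for all $i$ and for all $\epsilon_1 \in (0, \epsilon)$:

(a) there exists $C \in (0, 1)$ such that, for all sufficiently small $\epsilon_2 > 0$ we have

$$F_\sigma(\mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}) \subset \mathcal{C}^i_{\Lambda,C\epsilon_2,\epsilon_1};$$

(b) for all unreduced $T \in \mathcal{D}^i_{\Lambda,\epsilon}$ and for all $\epsilon_1, \epsilon_2 > 0$ there exists $k$ such that $F_\sigma^k(T) \in \mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$.

Proof: Combine Lemma 8.1 with $\Pi^i \circ F_s = F_s \circ \Pi^i$ (Proposition S3]). ■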

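Proof of Theorem 4: From the hypothesis that $\mathcal{C}_{\Lambda,0}$ and $S_\sigma$ are disjoint it follows
that, for sufficiently small $\epsilon_1, \epsilon_2 > 0$, the shift strategy $\sigma$ is smooth in $\mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$. As
in Lemma 5.2, from a Taylor expansion around $T_0 \in \mathcal{D}^i_{\Lambda,0}$, there exists $C_2$ such
that $|\sigma(T) - \lambda_i| \le C_2 |b_1(T)|^2$ for all $T \in \mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$. As in the proof of Theorem 2, there
exists $C_3$ such that $|b_1(F_\sigma(T))| \le C_3 |b_1(T)|^3$ for all $T \in \mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$. From item (a)
of Corollary 8.2, $\mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$ is invariant under $F_\sigma$; from item (b), for all unreduced
$T \in \mathcal{D}^i_{\Lambda,\epsilon}$ (where $\epsilon$ is sufficiently small) there exists $K$ such that, for all $k > K$,
$F_\sigma^k(T) \in \mathcal{C}^i_{\Lambda,\epsilon_2,\epsilon_1}$, completing the proof. ■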



9 Two counterexamples 

In this section we present two examples which show that natural strengthenings of 
Theorems [3] and 2] do not hold for Wilkinson's strategy u. 

We use the notation of Section 3. In Figure 2, where $\Lambda = \operatorname{diag}(1, 2, 4)$, we
indicate a sequence $F_\omega^k(T)$ which enters the deflation neighborhood $\mathcal{D}^i_{\Lambda,\epsilon}$ near one
diagonal matrix but travels within the neighborhood towards another diagonal matrix.
Theorem 2 guarantees the cubic decay of the $(3, 2)$ entry whenever $F_\omega^k(T)$ stays
away from the singular support $S_\omega$. Consistently with Theorem 3, this happens for
practically all values of $k$. Notice however that no uniform bound exists on the
number of iterations needed to reach (a neighborhood of) $S_\omega$. As proved in [8], in
this instance cubic decay does not hold. More precisely, it is not true that given an
a.p. free matrix $\Lambda$ there exist $C > 0$ and $K$ such that $|b(F_\omega^{k+1}(T))| \le C |b(F_\omega^k(T))|^3$
for all $k > K$.

Consider now the weak a.p. spectrum $\Lambda = \operatorname{diag}(-1, 0, 0.3, 1)$ and

where $S_0 \in \mathcal{T}_{\Lambda_3}$, $\Lambda_3 = \operatorname{diag}(-1, 0, 1)$, is an example of unreduced matrix obtained
in [5] for which convergence is strictly quadratic, i.e.,

$$C_- |b(F_\omega^k(S_0))|^2 \le |b(F_\omega^{k+1}(S_0))| \le C_+ |b(F_\omega^k(S_0))|^2$$

for all $k$, where $0 < C_- < C_+$. Trivially, the analogous estimate holds for $b(F_\omega^k(T_0))$.
By sheer continuity, given $K$, there exists $\epsilon > 0$ such that if $T \in \mathcal{T}_\Lambda$ satisfies
$\|T - T_0\| < \epsilon$ then

$$C_- |b(F_\omega^k(T))|^2 \le |b(F_\omega^{k+1}(T))| \le C_+ |b(F_\omega^k(T))|^2$$

still holds for all $k < K$. Thus, the uniform estimate in Theorem 3 fails for weak
a.p. spectra, even for unreduced matrices.

Figure 2: We may have $F_\omega^k(T) \in S_\omega$ for large values of $k$.